mirror/go - go - Git Fam. Sieh

Commit Graph

Author	SHA1	Message	Date
apocelipes	c841ba3a3e	math/big: use built-in clear to simplify code Change-Id: I07c3a498ce1e462c3d1703d77e7d7824e9334651 GitHub-Last-Rev: `2ba8c4c705` GitHub-Pull-Request: golang/go#66312 Reviewed-on: https://go-review.googlesource.com/c/go/+/571636 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2024-03-14 17:02:38 +00:00
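Note on c841ba3a3e: the commit message only says that the built-in `clear` replaces hand-written code. The sketch below shows that kind of simplification on a plain slice. The helper names `zeroLoop` and `zeroClear` are hypothetical and are not taken from math/big; `clear` requires Go 1.21 or later.

```go
package main

import "fmt"

// zeroLoop is the hand-written form: assign the zero value to each element.
func zeroLoop(z []uint) {
	for i := range z {
		z[i] = 0
	}
}

// zeroClear does the same with the built-in clear, which sets every
// element of a slice to its zero value and leaves the length unchanged.
func zeroClear(z []uint) {
	clear(z)
}

func main() {
	a := []uint{1, 2, 3}
	b := []uint{1, 2, 3}
	zeroLoop(a)
	zeroClear(b)
	fmt.Println(a, b, len(b)) // [0 0 0] [0 0 0] 3
}
```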
Oleksandr Redko	806ea41fce	math/rand, math/rand/v2: rename receiver variables According to the https://go.dev/wiki/CodeReviewComments#receiver-names Change-Id: Ib8bc57cf6a680e5c75d7346b74e77847945f6939 Reviewed-on: https://go-review.googlesource.com/c/go/+/568635 Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com>	2024-03-04 17:32:49 +00:00
Meng Zhuo	ad377e906a	math: add round assembly implementations on riscv64
goos: linux
goarch: riscv64
pkg: math
            │ floor_old.bench │          floor_new.bench           │
            │     sec/op      │   sec/op     vs base               │
Ceil              54.12n ± 0%   22.05n ± 0%  -59.26% (p=0.000 n=10)
Floor             40.80n ± 0%   22.05n ± 0%  -45.96% (p=0.000 n=10)
Round             20.73n ± 0%   20.74n ± 0%        ~ (p=0.441 n=10)
RoundToEven       24.07n ± 0%   24.07n ± 0%        ~ (p=1.000 n=10)
Trunc             38.73n ± 0%   22.05n ± 0%  -43.07% (p=0.000 n=10)
geomean           33.58n        22.17n       -33.98%
Change-Id: I24fb9e3bbf8146da253b6791b21377bea1afbd16
Reviewed-on: https://go-review.googlesource.com/c/go/+/504737
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: M Zhuo <mengzhuo1203@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>	2024-02-23 08:34:12 +00:00
Daniel Martí	5c92f43c51	math/rand/v2: use a doc link for crypto/rand It's easier to go look at its documentation when there's a link. Change-Id: Iad6c1aa1a3f4b9127dc526b4db473239329780d6 Reviewed-on: https://go-review.googlesource.com/c/go/+/563255 Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Than McIntosh <thanm@google.com>	2024-02-19 08:55:25 +00:00
Lynn Boger	4fde3ef2ac	math/big,crypto/internal/bigmod: unroll loop in addMulVVW for ppc64x
This updates the assembly implementation of AddMulVVW to unroll the main loop to do 64 bytes at a time. The code for addMulVVWx is based on the same code and has also been updated to improve performance.
goos: linux
goarch: ppc64le
pkg: crypto/internal/bigmod
cpu: POWER10
                │ bg.orig.out │              bg.out               │
                │   sec/op    │   sec/op     vs base              │
ModAdd            116.3n ± 0%   116.9n ± 0%   +0.52% (p=0.002 n=6)
ModSub            111.5n ± 0%   111.5n ± 0%    0.00% (p=0.273 n=6)
MontgomeryRepr    2.195µ ± 0%   1.944µ ± 0%  -11.44% (p=0.002 n=6)
MontgomeryMul     2.195µ ± 0%   1.943µ ± 0%  -11.48% (p=0.002 n=6)
ModMul            4.418µ ± 0%   3.900µ ± 0%  -11.72% (p=0.002 n=6)
ExpBig            5.736m ± 0%   5.117m ± 0%  -10.78% (p=0.002 n=6)
Exp               5.891m ± 0%   5.237m ± 0%  -11.11% (p=0.002 n=6)
geomean           9.901µ        9.094µ        -8.15%
goos: linux
goarch: ppc64le
pkg: math/big
cpu: POWER10
                  │ am.orig.out  │              am.out               │
                  │    sec/op    │   sec/op     vs base              │
AddMulVVW/1          4.456n ± 1%   3.565n ± 0%  -20.00% (p=0.002 n=6)
AddMulVVW/2          4.875n ± 1%   5.938n ± 1%  +21.79% (p=0.002 n=6)
AddMulVVW/3          5.484n ± 0%   5.693n ± 0%   +3.80% (p=0.002 n=6)
AddMulVVW/4          6.370n ± 0%   6.065n ± 0%   -4.79% (p=0.002 n=6)
AddMulVVW/5          7.321n ± 0%   7.188n ± 0%   -1.82% (p=0.002 n=6)
AddMulVVW/10         12.26n ± 8%   11.41n ± 0%   -6.97% (p=0.002 n=6)
AddMulVVW/100       100.70n ± 0%   93.58n ± 0%   -7.08% (p=0.002 n=6)
AddMulVVW/1000       938.6n ± 0%   845.5n ± 0%   -9.92% (p=0.002 n=6)
AddMulVVW/10000      9.459µ ± 0%   8.415µ ± 0%  -11.04% (p=0.002 n=6)
AddMulVVW/100000     94.57µ ± 0%   84.01µ ± 0%  -11.16% (p=0.002 n=6)
geomean              75.17n        71.21n        -5.27%
Change-Id: Idd79f5f02387564f4c2cc28d50b1c12bcd9a400f
Reviewed-on: https://go-review.googlesource.com/c/go/+/557915
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Paul Murphy <murp@ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>	2024-01-25 19:32:43 +00:00
Robert Griesemer	aba18d5b67	math/big: fix uint64 overflow in nat.mulRange Compute median as a + (b-a)/2 instead of (a + b)/2. Add additional test cases. Fixes #65025. Change-Id: Ib716a1036c17f8f33f51e33cedab13512eb7e0be Reviewed-on: https://go-review.googlesource.com/c/go/+/554617 Reviewed-by: Robert Griesemer <gri@google.com> Auto-Submit: Robert Griesemer <gri@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Robert Griesemer <gri@google.com> Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>	2024-01-09 15:29:36 +00:00
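Note on aba18d5b67: the message describes replacing `(a + b)/2` with `a + (b-a)/2` when computing the median of a range. A minimal sketch of why this matters for `uint64` follows. The function names `midNaive` and `midSafe` are hypothetical and are not the identifiers used in `nat.mulRange`.

```go
package main

import (
	"fmt"
	"math"
)

// midNaive computes (a+b)/2. The sum a+b wraps around modulo 2^64
// when it exceeds math.MaxUint64, so the result can be wrong.
func midNaive(a, b uint64) uint64 { return (a + b) / 2 }

// midSafe computes a + (b-a)/2. For a <= b neither the subtraction
// nor the addition can overflow.
func midSafe(a, b uint64) uint64 { return a + (b-a)/2 }

func main() {
	a := uint64(math.MaxUint64 - 3)
	b := uint64(math.MaxUint64 - 1)
	want := a + 1 // true midpoint of [a, b]

	fmt.Println(midNaive(a, b) == want) // false: a+b wrapped around
	fmt.Println(midSafe(a, b) == want)  // true
}
```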
Danil Timerbulatov	527829a7cb	all: remove newline characters after return statements This commit is aimed at improving the readability and consistency of the code base. Extraneous newline characters were present after some return statements, creating unnecessary separation in the code. Fixes #64610 Change-Id: Ic1b05bf11761c4dff22691c2f1c3755f66d341f7 Reviewed-on: https://go-review.googlesource.com/c/go/+/548316 Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>	2023-12-14 17:22:18 +00:00
Russ Cox	c29444ef39	math/rand, math/rand/v2: use ChaCha8 for global rand Move ChaCha8 code into internal/chacha8rand and use it to implement runtime.rand, which is used for the unseeded global source for both math/rand and math/rand/v2. This also affects the calculation of the start point for iteration over very very large maps (when the 32-bit fastrand is not big enough). The benefit is that misuse of the global random number generators in math/rand and math/rand/v2 in contexts where non-predictable randomness is important for security reasons is no longer a security problem, removing a common mistake among programmers who are unaware of the different kinds of randomness. The cost is an extra 304 bytes per thread stored in the m struct plus 2-3ns more per random uint64 due to the more sophisticated algorithm. Using PCG looks like it would cost about the same, although I haven't benchmarked that. Before this, the math/rand and math/rand/v2 global generator was wyrand (https://github.com/wangyi-fudan/wyhash). For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson ALFG was justifiable, since the latter was not any better. But for math/rand/v2, the global generator really should be at least as good as one of the well-studied, specific algorithms provided directly by the package, and it's not. (Wyrand is still reasonable for scheduling and cache decisions.) Good randomness does have a cost: about twice wyrand. Also rationalize the various runtime rand references. 
goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │ │ sec/op │ sec/op vs base │ ChaCha8-32 1.862n ± 2% 1.861n ± 2% ~ (p=0.825 n=20) PCG_DXSM-32 1.471n ± 1% 1.460n ± 2% ~ (p=0.153 n=20) SourceUint64-32 1.636n ± 2% 1.582n ± 1% -3.30% (p=0.000 n=20) GlobalInt64-32 2.087n ± 1% 3.663n ± 1% +75.54% (p=0.000 n=20) GlobalInt64Parallel-32 0.1042n ± 1% 0.2026n ± 1% +94.48% (p=0.000 n=20) GlobalUint64-32 2.263n ± 2% 3.724n ± 1% +64.57% (p=0.000 n=20) GlobalUint64Parallel-32 0.1019n ± 1% 0.1973n ± 1% +93.67% (p=0.000 n=20) Int64-32 1.771n ± 1% 1.774n ± 1% ~ (p=0.449 n=20) Uint64-32 1.863n ± 2% 1.866n ± 1% ~ (p=0.364 n=20) GlobalIntN1000-32 3.134n ± 3% 4.730n ± 2% +50.95% (p=0.000 n=20) IntN1000-32 2.489n ± 1% 2.489n ± 1% ~ (p=0.683 n=20) Int64N1000-32 2.521n ± 1% 2.516n ± 1% ~ (p=0.394 n=20) Int64N1e8-32 2.479n ± 1% 2.478n ± 2% ~ (p=0.743 n=20) Int64N1e9-32 2.530n ± 2% 2.514n ± 2% ~ (p=0.193 n=20) Int64N2e9-32 2.501n ± 1% 2.494n ± 1% ~ (p=0.616 n=20) Int64N1e18-32 3.227n ± 1% 3.205n ± 1% ~ (p=0.101 n=20) Int64N2e18-32 3.647n ± 1% 3.599n ± 1% ~ (p=0.019 n=20) Int64N4e18-32 5.135n ± 1% 5.069n ± 2% ~ (p=0.034 n=20) Int32N1000-32 2.657n ± 1% 2.637n ± 1% ~ (p=0.180 n=20) Int32N1e8-32 2.636n ± 1% 2.636n ± 1% ~ (p=0.763 n=20) Int32N1e9-32 2.660n ± 2% 2.638n ± 1% ~ (p=0.358 n=20) Int32N2e9-32 2.662n ± 2% 2.618n ± 2% ~ (p=0.064 n=20) Float32-32 2.272n ± 2% 2.239n ± 2% ~ (p=0.194 n=20) Float64-32 2.272n ± 1% 2.286n ± 2% ~ (p=0.763 n=20) ExpFloat64-32 3.762n ± 1% 3.744n ± 1% ~ (p=0.171 n=20) NormFloat64-32 3.706n ± 1% 3.655n ± 2% ~ (p=0.066 n=20) Perm3-32 32.93n ± 3% 34.62n ± 1% +5.13% (p=0.000 n=20) Perm30-32 202.9n ± 1% 204.0n ± 1% ~ (p=0.482 n=20) Perm30ViaShuffle-32 115.0n ± 1% 114.9n ± 1% ~ (p=0.358 n=20) ShuffleOverhead-32 112.8n ± 1% 112.7n ± 1% ~ (p=0.692 n=20) Concurrent-32 2.107n ± 0% 3.725n ± 1% +76.75% (p=0.000 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 │ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │ │ 
sec/op │ sec/op vs base │ ChaCha8-8 2.480n ± 0% 2.429n ± 0% -2.04% (p=0.000 n=20) PCG_DXSM-8 2.531n ± 0% 2.530n ± 0% ~ (p=0.877 n=20) SourceUint64-8 2.534n ± 0% 2.533n ± 0% ~ (p=0.732 n=20) GlobalInt64-8 2.172n ± 1% 4.794n ± 0% +120.67% (p=0.000 n=20) GlobalInt64Parallel-8 0.4320n ± 0% 0.9605n ± 0% +122.32% (p=0.000 n=20) GlobalUint64-8 2.182n ± 0% 4.770n ± 0% +118.58% (p=0.000 n=20) GlobalUint64Parallel-8 0.4307n ± 0% 0.9583n ± 0% +122.51% (p=0.000 n=20) Int64-8 4.107n ± 0% 4.104n ± 0% ~ (p=0.416 n=20) Uint64-8 4.080n ± 0% 4.080n ± 0% ~ (p=0.052 n=20) GlobalIntN1000-8 2.814n ± 2% 5.643n ± 0% +100.50% (p=0.000 n=20) IntN1000-8 4.141n ± 0% 4.139n ± 0% ~ (p=0.140 n=20) Int64N1000-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.313 n=20) Int64N1e8-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.103 n=20) Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.761 n=20) Int64N2e9-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.636 n=20) Int64N1e18-8 5.266n ± 0% 5.326n ± 1% +1.14% (p=0.001 n=20) Int64N2e18-8 6.052n ± 0% 6.167n ± 0% +1.90% (p=0.000 n=20) Int64N4e18-8 8.826n ± 0% 9.051n ± 0% +2.55% (p=0.000 n=20) Int32N1000-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20) Int32N1e8-8 4.126n ± 0% 4.131n ± 0% +0.12% (p=0.000 n=20) Int32N1e9-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20) Int32N2e9-8 4.132n ± 0% 4.131n ± 0% ~ (p=0.017 n=20) Float32-8 4.109n ± 0% 4.105n ± 0% ~ (p=0.379 n=20) Float64-8 4.107n ± 0% 4.106n ± 0% ~ (p=0.867 n=20) ExpFloat64-8 5.339n ± 0% 5.383n ± 0% +0.82% (p=0.000 n=20) NormFloat64-8 5.735n ± 0% 5.737n ± 1% ~ (p=0.856 n=20) Perm3-8 26.65n ± 0% 26.80n ± 1% +0.58% (p=0.000 n=20) Perm30-8 194.8n ± 1% 197.0n ± 0% +1.18% (p=0.000 n=20) Perm30ViaShuffle-8 156.6n ± 0% 157.6n ± 1% +0.61% (p=0.000 n=20) ShuffleOverhead-8 124.9n ± 0% 125.5n ± 0% +0.52% (p=0.000 n=20) Concurrent-8 2.434n ± 3% 5.066n ± 0% +108.09% (p=0.000 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ bbb48afeb7.386 │ 5cf807d1ea.386 │ │ sec/op │ sec/op vs base │ ChaCha8-32 11.295n ± 1% 4.748n ± 2% -57.96% 
(p=0.000 n=20) PCG_DXSM-32 7.693n ± 1% 7.738n ± 2% ~ (p=0.542 n=20) SourceUint64-32 7.658n ± 2% 7.622n ± 2% ~ (p=0.344 n=20) GlobalInt64-32 3.473n ± 2% 7.526n ± 2% +116.73% (p=0.000 n=20) GlobalInt64Parallel-32 0.3198n ± 0% 0.5444n ± 0% +70.22% (p=0.000 n=20) GlobalUint64-32 3.612n ± 0% 7.575n ± 1% +109.69% (p=0.000 n=20) GlobalUint64Parallel-32 0.3168n ± 0% 0.5403n ± 0% +70.51% (p=0.000 n=20) Int64-32 7.673n ± 2% 7.789n ± 1% ~ (p=0.122 n=20) Uint64-32 7.773n ± 1% 7.827n ± 2% ~ (p=0.920 n=20) GlobalIntN1000-32 6.268n ± 1% 9.581n ± 1% +52.87% (p=0.000 n=20) IntN1000-32 10.33n ± 2% 10.45n ± 1% ~ (p=0.233 n=20) Int64N1000-32 10.98n ± 2% 11.01n ± 1% ~ (p=0.401 n=20) Int64N1e8-32 11.19n ± 2% 10.97n ± 1% ~ (p=0.033 n=20) Int64N1e9-32 11.06n ± 1% 11.08n ± 1% ~ (p=0.498 n=20) Int64N2e9-32 11.10n ± 1% 11.01n ± 2% ~ (p=0.995 n=20) Int64N1e18-32 15.23n ± 2% 15.04n ± 1% ~ (p=0.973 n=20) Int64N2e18-32 15.89n ± 1% 15.85n ± 1% ~ (p=0.409 n=20) Int64N4e18-32 18.96n ± 2% 19.34n ± 2% ~ (p=0.048 n=20) Int32N1000-32 10.46n ± 2% 10.44n ± 2% ~ (p=0.480 n=20) Int32N1e8-32 10.46n ± 2% 10.49n ± 2% ~ (p=0.951 n=20) Int32N1e9-32 10.28n ± 2% 10.26n ± 1% ~ (p=0.431 n=20) Int32N2e9-32 10.50n ± 2% 10.44n ± 2% ~ (p=0.249 n=20) Float32-32 13.80n ± 2% 13.80n ± 2% ~ (p=0.751 n=20) Float64-32 23.55n ± 2% 23.87n ± 0% ~ (p=0.408 n=20) ExpFloat64-32 15.36n ± 1% 15.29n ± 2% ~ (p=0.316 n=20) NormFloat64-32 13.57n ± 1% 13.79n ± 1% +1.66% (p=0.005 n=20) Perm3-32 45.70n ± 2% 46.99n ± 2% +2.81% (p=0.001 n=20) Perm30-32 399.0n ± 1% 403.8n ± 1% +1.19% (p=0.006 n=20) Perm30ViaShuffle-32 349.0n ± 1% 350.4n ± 1% ~ (p=0.909 n=20) ShuffleOverhead-32 322.3n ± 1% 323.8n ± 1% ~ (p=0.410 n=20) Concurrent-32 3.331n ± 1% 7.312n ± 1% +119.50% (p=0.000 n=20) For #61716. 
Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2 Reviewed-on: https://go-review.googlesource.com/c/go/+/516860 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Filippo Valsorda <filippo@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2023-12-05 20:34:30 +00:00
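Note on c29444ef39: the change affects the unseeded global source behind the top-level functions of math/rand and math/rand/v2. The sketch below only shows what such a call site looks like with math/rand/v2 (Go 1.22 or later). The helper `roll` is hypothetical. Code that handles secrets such as keys or tokens would normally still use crypto/rand.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// roll returns a value in [1, 6] drawn from the unseeded global generator.
// No explicit seeding is needed before calling the top-level functions.
func roll() int {
	return rand.IntN(6) + 1
}

func main() {
	fmt.Println(roll(), rand.Uint64())
}
```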
Russ Cox	d92434935f	math/rand/v2: add ChaCha8 This is a replay of CL 516859, after its rollback in CL 543895, with big-endian systems fixed and the tests disabled on RISC-V since the compiler is broken there (#64285). ChaCha8 provides a cryptographically strong generator alongside PCG, so that people who want stronger randomness have access to that. On systems with 128-bit vector math assembly (amd64 and arm64), ChaCha8 runs at about the same speed as PCG (25% slower on amd64, 2% faster on arm64). Fixes #64284. Change-Id: I6290bb8ace28e1aff9a61f805dbe380ccdf25b94 Reviewed-on: https://go-review.googlesource.com/c/go/+/546020 Reviewed-by: Filippo Valsorda <filippo@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2023-12-05 20:32:54 +00:00
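Note on d92434935f: a minimal sketch of using the ChaCha8 source added to math/rand/v2, assuming Go 1.22 or later. The fixed seed text and the helper `sameStream` are illustrative only. A real seed would come from a proper entropy source.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// sameStream reports whether two generators built from the same 32-byte
// seed produce identical values for the first n draws.
func sameStream(seed [32]byte, n int) bool {
	a := rand.New(rand.NewChaCha8(seed))
	b := rand.New(rand.NewChaCha8(seed))
	for i := 0; i < n; i++ {
		if a.Uint64() != b.Uint64() {
			return false
		}
	}
	return true
}

func main() {
	var seed [32]byte
	copy(seed[:], "example seed, not a secret value")

	r := rand.New(rand.NewChaCha8(seed))
	fmt.Println(r.IntN(100))          // some value in [0, 100)
	fmt.Println(sameStream(seed, 16)) // true: same seed, same sequence
}
```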
Michael Knyszek	82fc03f9c9	Revert "math/rand/v2: add ChaCha8" This reverts commit `6382893890`. Reason for revert: Causes failures on big endian platforms and riscv64. Possibly a bug in the generic implementation. For #64284. For #64285. Change-Id: Ic1bb8533d9641fae28d0337b36d434b9a575cd7e Reviewed-on: https://go-review.googlesource.com/c/go/+/543895 Reviewed-by: Heschi Kreinick <heschi@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> TryBot-Bypass: Michael Knyszek <mknyszek@google.com>	2023-11-20 18:39:03 +00:00
Ludi Rehak	ee6b34797b	all: add floating point option for ARM targets This change introduces new options to set the floating point mode on ARM targets. The GOARM version number can optionally be followed by ',hardfloat' or ',softfloat' to select whether to use hardware instructions or software emulation for floating point computations, respectively. For example, GOARM=7,softfloat. Previously, software floating point support was limited to GOARM=5. With these options, software floating point is now extended to all ARM versions, including GOARM=6 and 7. This change also extends hardware floating point to GOARM=5. GOARM=5 defaults to softfloat and GOARM=6 and 7 default to hardfloat. For #61588 Change-Id: I23dc86fbd0733b262004a2ed001e1032cf371e94 Reviewed-on: https://go-review.googlesource.com/c/go/+/514907 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>	2023-11-20 17:19:36 +00:00
Russ Cox	6382893890	math/rand/v2: add ChaCha8 ChaCha8 provides a cryptographically strong generator alongside PCG, so that people who want stronger randomness have access to that. On systems with 128-bit vector math assembly (amd64 and arm64), ChaCha8 runs at about the same speed as PCG (25% slower on amd64, 2% faster on arm64). Obviously all the claimed benchmark variation other than the new ChaCha8 benchmark is a lie. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ afa459a2f0.amd64 │ bbb48afeb7.amd64 │ │ sec/op │ sec/op vs base │ PCG_DXSM-32 1.488n ± 2% 1.492n ± 2% ~ (p=0.309 n=20) ChaCha8-32 1.861n ± 2% SourceUint64-32 1.450n ± 3% 1.590n ± 2% +9.69% (p=0.000 n=20) GlobalInt64-32 2.067n ± 2% 2.061n ± 1% ~ (p=0.952 n=20) GlobalInt64Parallel-32 0.1044n ± 2% 0.1041n ± 1% ~ (p=0.498 n=20) GlobalUint64-32 2.085n ± 0% 2.256n ± 2% +8.23% (p=0.000 n=20) GlobalUint64Parallel-32 0.1008n ± 1% 0.1018n ± 1% ~ (p=0.041 n=20) Int64-32 1.779n ± 1% 1.779n ± 1% ~ (p=0.410 n=20) Uint64-32 1.854n ± 2% 1.882n ± 1% ~ (p=0.044 n=20) GlobalIntN1000-32 3.140n ± 3% 3.115n ± 3% ~ (p=0.673 n=20) IntN1000-32 2.496n ± 1% 2.509n ± 1% ~ (p=0.171 n=20) Int64N1000-32 2.510n ± 2% 2.493n ± 1% ~ (p=0.804 n=20) Int64N1e8-32 2.471n ± 2% 2.521n ± 1% +1.98% (p=0.003 n=20) Int64N1e9-32 2.488n ± 2% 2.506n ± 1% ~ (p=0.663 n=20) Int64N2e9-32 2.478n ± 2% 2.482n ± 2% ~ (p=0.533 n=20) Int64N1e18-32 3.088n ± 1% 3.216n ± 1% +4.15% (p=0.000 n=20) Int64N2e18-32 3.493n ± 1% 3.635n ± 2% +4.05% (p=0.000 n=20) Int64N4e18-32 5.060n ± 2% 5.122n ± 1% +1.22% (p=0.000 n=20) Int32N1000-32 2.620n ± 1% 2.672n ± 1% +2.00% (p=0.002 n=20) Int32N1e8-32 2.652n ± 0% 2.646n ± 1% ~ (p=0.743 n=20) Int32N1e9-32 2.644n ± 1% 2.660n ± 2% ~ (p=0.163 n=20) Int32N2e9-32 2.619n ± 2% 2.652n ± 1% ~ (p=0.132 n=20) Float32-32 2.261n ± 1% 2.267n ± 1% ~ (p=0.516 n=20) Float64-32 2.241n ± 2% 2.276n ± 1% ~ (p=0.080 n=20) ExpFloat64-32 3.716n ± 1% 3.779n ± 1% +1.68% (p=0.007 n=20) NormFloat64-32 3.718n ± 1% 
3.747n ± 1% ~ (p=0.011 n=20) Perm3-32 34.11n ± 2% 34.23n ± 2% ~ (p=0.779 n=20) Perm30-32 200.6n ± 0% 202.3n ± 2% ~ (p=0.055 n=20) Perm30ViaShuffle-32 109.7n ± 1% 115.5n ± 2% +5.34% (p=0.000 n=20) ShuffleOverhead-32 107.2n ± 1% 113.3n ± 1% +5.74% (p=0.000 n=20) Concurrent-32 2.108n ± 6% 2.107n ± 1% ~ (p=0.448 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ afa459a2f0.arm64 │ bbb48afeb7.arm64 │ │ sec/op │ sec/op vs base │ PCG_DXSM-8 2.531n ± 0% 2.529n ± 0% ~ (p=0.586 n=20) ChaCha8-8 2.480n ± 0% SourceUint64-8 2.531n ± 0% 2.534n ± 0% ~ (p=0.227 n=20) GlobalInt64-8 2.177n ± 1% 2.173n ± 1% ~ (p=0.733 n=20) GlobalInt64Parallel-8 0.4319n ± 0% 0.4304n ± 0% -0.32% (p=0.003 n=20) GlobalUint64-8 2.185n ± 1% 2.185n ± 0% ~ (p=0.541 n=20) GlobalUint64Parallel-8 0.4295n ± 1% 0.4294n ± 0% ~ (p=0.203 n=20) Int64-8 4.104n ± 0% 4.107n ± 0% ~ (p=0.193 n=20) Uint64-8 4.080n ± 0% 4.081n ± 0% ~ (p=0.053 n=20) GlobalIntN1000-8 2.814n ± 1% 2.814n ± 0% ~ (p=0.879 n=20) IntN1000-8 4.140n ± 0% 4.141n ± 0% ~ (p=0.428 n=20) Int64N1000-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.114 n=20) Int64N1e8-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.898 n=20) Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.593 n=20) Int64N2e9-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.158 n=20) Int64N1e18-8 5.273n ± 0% 5.274n ± 0% ~ (p=0.308 n=20) Int64N2e18-8 6.059n ± 0% 6.058n ± 0% ~ (p=0.053 n=20) Int64N4e18-8 8.803n ± 0% 8.800n ± 0% ~ (p=0.673 n=20) Int32N1000-8 4.131n ± 0% 4.131n ± 0% ~ (p=0.342 n=20) Int32N1e8-8 4.131n ± 0% 4.131n ± 0% ~ (p=0.091 n=20) Int32N1e9-8 4.131n ± 0% 4.131n ± 0% ~ (p=0.273 n=20) Int32N2e9-8 4.131n ± 0% 4.131n ± 0% ~ (p=0.425 n=20) Float32-8 4.110n ± 0% 4.112n ± 0% ~ (p=0.203 n=20) Float64-8 4.104n ± 0% 4.106n ± 0% ~ (p=0.409 n=20) ExpFloat64-8 5.338n ± 0% 5.339n ± 0% ~ (p=0.037 n=20) NormFloat64-8 5.731n ± 0% 5.733n ± 0% ~ (p=0.692 n=20) Perm3-8 26.62n ± 0% 26.65n ± 0% +0.09% (p=0.000 n=20) Perm30-8 194.6n ± 2% 194.9n ± 0% ~ (p=0.141 n=20) Perm30ViaShuffle-8 156.4n ± 0% 156.5n ± 0% +0.06% (p=0.000 n=20) 
ShuffleOverhead-8 125.8n ± 0% 125.0n ± 0% -0.64% (p=0.000 n=20) Concurrent-8 2.654n ± 6% 2.441n ± 6% -8.06% (p=0.009 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ afa459a2f0.386 │ bbb48afeb7.386 │ │ sec/op │ sec/op vs base │ PCG_DXSM-32 7.793n ± 2% 7.647n ± 1% ~ (p=0.021 n=20) ChaCha8-32 11.48n ± 2% SourceUint64-32 7.680n ± 1% 7.714n ± 1% ~ (p=0.713 n=20) GlobalInt64-32 3.474n ± 3% 3.491n ± 28% ~ (p=0.337 n=20) GlobalInt64Parallel-32 0.3253n ± 0% 0.3194n ± 0% -1.81% (p=0.000 n=20) GlobalUint64-32 3.433n ± 2% 3.610n ± 2% +5.14% (p=0.000 n=20) GlobalUint64Parallel-32 0.3156n ± 0% 0.3164n ± 0% ~ (p=0.073 n=20) Int64-32 7.707n ± 1% 7.824n ± 0% +1.52% (p=0.005 n=20) Uint64-32 7.714n ± 1% 7.732n ± 2% ~ (p=0.441 n=20) GlobalIntN1000-32 6.236n ± 1% 6.176n ± 2% ~ (p=0.499 n=20) IntN1000-32 10.41n ± 1% 10.31n ± 2% ~ (p=0.782 n=20) Int64N1000-32 10.97n ± 2% 11.22n ± 2% +2.19% (p=0.002 n=20) Int64N1e8-32 10.98n ± 1% 11.07n ± 1% ~ (p=0.056 n=20) Int64N1e9-32 10.95n ± 0% 11.15n ± 2% ~ (p=0.016 n=20) Int64N2e9-32 11.11n ± 1% 11.00n ± 1% ~ (p=0.654 n=20) Int64N1e18-32 15.18n ± 2% 14.97n ± 2% ~ (p=0.387 n=20) Int64N2e18-32 15.61n ± 1% 15.91n ± 1% +1.92% (p=0.003 n=20) Int64N4e18-32 19.23n ± 2% 18.98n ± 1% ~ (p=1.000 n=20) Int32N1000-32 10.35n ± 1% 10.31n ± 2% ~ (p=0.081 n=20) Int32N1e8-32 10.33n ± 1% 10.38n ± 1% ~ (p=0.335 n=20) Int32N1e9-32 10.35n ± 1% 10.37n ± 1% ~ (p=0.497 n=20) Int32N2e9-32 10.35n ± 1% 10.41n ± 1% ~ (p=0.605 n=20) Float32-32 13.57n ± 1% 13.78n ± 2% ~ (p=0.047 n=20) Float64-32 22.95n ± 4% 23.43n ± 3% ~ (p=0.218 n=20) ExpFloat64-32 15.23n ± 2% 15.46n ± 1% ~ (p=0.095 n=20) NormFloat64-32 13.78n ± 1% 13.73n ± 2% ~ (p=0.031 n=20) Perm3-32 46.62n ± 2% 47.46n ± 2% +1.82% (p=0.004 n=20) Perm30-32 400.7n ± 1% 403.5n ± 1% ~ (p=0.098 n=20) Perm30ViaShuffle-32 350.5n ± 1% 348.1n ± 2% ~ (p=0.703 n=20) ShuffleOverhead-32 326.0n ± 2% 326.2n ± 2% ~ (p=0.440 n=20) Concurrent-32 3.290n ± 0% 3.297n ± 4% ~ (p=0.189 n=20) For #61716. 
Change-Id: Id2a7e1c1db0beb81f563faaefba65fe292497269 Reviewed-on: https://go-review.googlesource.com/c/go/+/516859 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Filippo Valsorda <filippo@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>	2023-11-19 22:05:54 +00:00
Robert Griesemer	22278e3835	math/big: faster FloatPrec implementation
Based on observations by Cherry Mui (see comments in CL 539299). Add new benchmark FloatPrecMixed. For #50489.
name                         old time/op  new time/op  delta
FloatPrecExact/1-12           129ns ± 0%   105ns ±11%  -18.51%  (p=0.008 n=5+5)
FloatPrecExact/10-12          317ns ± 2%   283ns ± 1%  -10.65%  (p=0.008 n=5+5)
FloatPrecExact/100-12        1.80µs ±15%  1.35µs ± 0%  -25.09%  (p=0.008 n=5+5)
FloatPrecExact/1000-12       9.48µs ±14%  8.32µs ± 1%  -12.25%  (p=0.008 n=5+5)
FloatPrecExact/10000-12       195µs ± 1%   191µs ± 0%   -1.73%  (p=0.008 n=5+5)
FloatPrecExact/100000-12     7.31ms ± 1%  7.24ms ± 1%   -0.99%  (p=0.032 n=5+5)
FloatPrecExact/1000000-12     301ms ± 3%   302ms ± 2%     ~     (p=0.841 n=5+5)
FloatPrecMixed/1-12           141ns ± 0%   110ns ± 3%  -21.88%  (p=0.008 n=5+5)
FloatPrecMixed/10-12          767ns ± 0%   739ns ± 5%     ~     (p=0.151 n=5+5)
FloatPrecMixed/100-12        4.93µs ± 2%  3.73µs ± 1%  -24.33%  (p=0.008 n=5+5)
FloatPrecMixed/1000-12       90.9µs ±11%  70.3µs ± 2%  -22.66%  (p=0.008 n=5+5)
FloatPrecMixed/10000-12      2.30ms ± 0%  1.92ms ± 1%  -16.41%  (p=0.008 n=5+5)
FloatPrecMixed/100000-12     87.1ms ± 1%  68.5ms ± 1%  -21.42%  (p=0.008 n=5+5)
FloatPrecMixed/1000000-12     4.09s ± 1%   3.58s ± 1%  -12.35%  (p=0.008 n=5+5)
FloatPrecInexact/1-12        92.4ns ± 0%  66.1ns ± 5%  -28.41%  (p=0.008 n=5+5)
FloatPrecInexact/10-12        118ns ± 0%    91ns ± 1%  -23.14%  (p=0.016 n=5+4)
FloatPrecInexact/100-12       310ns ±10%   244ns ± 1%  -21.32%  (p=0.008 n=5+5)
FloatPrecInexact/1000-12      952ns ± 1%   828ns ± 1%  -12.96%  (p=0.016 n=4+5)
FloatPrecInexact/10000-12    6.71µs ± 1%  6.25µs ± 3%   -6.83%  (p=0.008 n=5+5)
FloatPrecInexact/100000-12   66.1µs ± 1%  61.2µs ± 1%   -7.45%  (p=0.008 n=5+5)
FloatPrecInexact/1000000-12   635µs ± 2%   584µs ± 1%   -7.97%  (p=0.008 n=5+5)
Change-Id: I3aa67b49a042814a3286ee8306fbed36709cbb6e
Reviewed-on: https://go-review.googlesource.com/c/go/+/542756
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>	2023-11-15 22:16:34 +00:00
Robert Griesemer	e14b96cb51	math/big: update comment in the implementation of FloatPrec Follow-up on CL 539299: missed to incorporate the updated comment per feedback on that CL. For #50489. Change-Id: Ib035400038b1d11532f62055b5cdb382ab75654c Reviewed-on: https://go-review.googlesource.com/c/go/+/542115 Run-TryBot: Robert Griesemer <gri@google.com> Auto-Submit: Robert Griesemer <gri@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>	2023-11-14 16:45:59 +00:00
Robert Griesemer	dd88f23a20	math/big: implement Rat.FloatPrec
goos: darwin
goarch: amd64
pkg: math/big
cpu: Intel(R) Core(TM) i7-8700B CPU @ 3.20GHz
BenchmarkFloatPrecExact/1-12              9380685        125.0 ns/op
BenchmarkFloatPrecExact/10-12             3780493        321.2 ns/op
BenchmarkFloatPrecExact/100-12             698272         1679 ns/op
BenchmarkFloatPrecExact/1000-12            117975         9113 ns/op
BenchmarkFloatPrecExact/10000-12             5913       192768 ns/op
BenchmarkFloatPrecExact/100000-12             164      7401817 ns/op
BenchmarkFloatPrecExact/1000000-12              4    293568523 ns/op
BenchmarkFloatPrecInexact/1-12           12836612        91.26 ns/op
BenchmarkFloatPrecInexact/10-12          10144908        114.9 ns/op
BenchmarkFloatPrecInexact/100-12          4121931        297.3 ns/op
BenchmarkFloatPrecInexact/1000-12         1275886        927.7 ns/op
BenchmarkFloatPrecInexact/10000-12         170392         6546 ns/op
BenchmarkFloatPrecInexact/100000-12         18307        65232 ns/op
BenchmarkFloatPrecInexact/1000000-12         1701       621412 ns/op
Fixes #50489.
Change-Id: Ic952f00e35d42f2470ecab53df712721997eac94
Reviewed-on: https://go-review.googlesource.com/c/go/+/539299
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>	2023-11-14 00:44:42 +00:00
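Note on dd88f23a20: a short usage sketch of `Rat.FloatPrec`, assuming Go 1.22 or later. The method returns the number of non-repeating digits after the decimal point and whether a decimal representation with that many digits is exact. The helper `prec` is hypothetical and exists only to keep the example compact.

```go
package main

import (
	"fmt"
	"math/big"
)

// prec returns the FloatPrec result for the rational a/b (b must be non-zero).
func prec(a, b int64) (n int, exact bool) {
	return big.NewRat(a, b).FloatPrec()
}

func main() {
	for _, r := range []*big.Rat{big.NewRat(1, 4), big.NewRat(1, 3)} {
		n, exact := r.FloatPrec()
		// FloatString(n) formats r with n digits after the decimal point.
		fmt.Printf("%s: n=%d exact=%t decimal=%s\n", r, n, exact, r.FloatString(n))
	}
	// 1/4: n=2 exact=true  (0.25 is an exact decimal)
	// 1/3: n=0 exact=false (0.333... repeats immediately)
}
```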
Russ Cox	8abde68f19	math/rand/v2: delete Mitchell/Reeds source These slowdowns are because we are now using PCG instead of the Mitchell/Reeds LFSR for the benchmarks. PCG is in fact a bit slower (but generates statically far better random numbers). goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 01ff938549.amd64 │ afa459a2f0.amd64 │ │ sec/op │ sec/op vs base │ PCG_DXSM-32 1.490n ± 0% 1.488n ± 2% ~ (p=0.408 n=20) SourceUint64-32 1.352n ± 1% 1.450n ± 3% +7.21% (p=0.000 n=20) GlobalInt64-32 2.083n ± 0% 2.067n ± 2% ~ (p=0.223 n=20) GlobalInt64Parallel-32 0.1035n ± 1% 0.1044n ± 2% ~ (p=0.010 n=20) GlobalUint64-32 2.038n ± 1% 2.085n ± 0% +2.28% (p=0.000 n=20) GlobalUint64Parallel-32 0.1006n ± 1% 0.1008n ± 1% ~ (p=0.733 n=20) Int64-32 1.687n ± 2% 1.779n ± 1% +5.48% (p=0.000 n=20) Uint64-32 1.674n ± 2% 1.854n ± 2% +10.69% (p=0.000 n=20) GlobalIntN1000-32 3.135n ± 1% 3.140n ± 3% ~ (p=0.794 n=20) IntN1000-32 2.478n ± 1% 2.496n ± 1% +0.73% (p=0.006 n=20) Int64N1000-32 2.455n ± 1% 2.510n ± 2% +2.22% (p=0.000 n=20) Int64N1e8-32 2.467n ± 2% 2.471n ± 2% ~ (p=0.050 n=20) Int64N1e9-32 2.454n ± 1% 2.488n ± 2% +1.39% (p=0.000 n=20) Int64N2e9-32 2.482n ± 1% 2.478n ± 2% ~ (p=0.066 n=20) Int64N1e18-32 3.349n ± 2% 3.088n ± 1% -7.81% (p=0.000 n=20) Int64N2e18-32 3.537n ± 1% 3.493n ± 1% -1.24% (p=0.002 n=20) Int64N4e18-32 4.917n ± 0% 5.060n ± 2% +2.91% (p=0.000 n=20) Int32N1000-32 2.386n ± 1% 2.620n ± 1% +9.76% (p=0.000 n=20) Int32N1e8-32 2.366n ± 1% 2.652n ± 0% +12.11% (p=0.000 n=20) Int32N1e9-32 2.355n ± 2% 2.644n ± 1% +12.32% (p=0.000 n=20) Int32N2e9-32 2.371n ± 1% 2.619n ± 2% +10.48% (p=0.000 n=20) Float32-32 2.245n ± 2% 2.261n ± 1% ~ (p=0.625 n=20) Float64-32 2.235n ± 1% 2.241n ± 2% ~ (p=0.393 n=20) ExpFloat64-32 3.813n ± 3% 3.716n ± 1% -2.53% (p=0.000 n=20) NormFloat64-32 3.652n ± 2% 3.718n ± 1% +1.79% (p=0.006 n=20) Perm3-32 33.12n ± 3% 34.11n ± 2% ~ (p=0.021 n=20) Perm30-32 205.1n ± 1% 200.6n ± 0% -2.17% (p=0.000 n=20) Perm30ViaShuffle-32 
110.8n ± 1% 109.7n ± 1% -0.99% (p=0.002 n=20) ShuffleOverhead-32 113.0n ± 1% 107.2n ± 1% -5.09% (p=0.000 n=20) Concurrent-32 2.100n ± 0% 2.108n ± 6% ~ (p=0.103 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 │ 01ff938549.arm64 │ afa459a2f0.arm64 │ │ sec/op │ sec/op vs base │ PCG_DXSM-8 2.531n ± 0% 2.531n ± 0% ~ (p=0.763 n=20) SourceUint64-8 2.258n ± 1% 2.531n ± 0% +12.09% (p=0.000 n=20) GlobalInt64-8 2.167n ± 0% 2.177n ± 1% ~ (p=0.213 n=20) GlobalInt64Parallel-8 0.4310n ± 0% 0.4319n ± 0% ~ (p=0.027 n=20) GlobalUint64-8 2.182n ± 1% 2.185n ± 1% ~ (p=0.683 n=20) GlobalUint64Parallel-8 0.4297n ± 0% 0.4295n ± 1% ~ (p=0.941 n=20) Int64-8 2.472n ± 1% 4.104n ± 0% +66.00% (p=0.000 n=20) Uint64-8 2.449n ± 1% 4.080n ± 0% +66.60% (p=0.000 n=20) GlobalIntN1000-8 2.814n ± 2% 2.814n ± 1% ~ (p=0.972 n=20) IntN1000-8 2.998n ± 2% 4.140n ± 0% +38.09% (p=0.000 n=20) Int64N1000-8 2.949n ± 2% 4.139n ± 0% +40.35% (p=0.000 n=20) Int64N1e8-8 2.953n ± 2% 4.140n ± 0% +40.22% (p=0.000 n=20) Int64N1e9-8 2.950n ± 0% 4.139n ± 0% +40.32% (p=0.000 n=20) Int64N2e9-8 2.946n ± 2% 4.140n ± 0% +40.53% (p=0.000 n=20) Int64N1e18-8 3.779n ± 1% 5.273n ± 0% +39.52% (p=0.000 n=20) Int64N2e18-8 4.370n ± 1% 6.059n ± 0% +38.65% (p=0.000 n=20) Int64N4e18-8 6.544n ± 1% 8.803n ± 0% +34.52% (p=0.000 n=20) Int32N1000-8 2.950n ± 0% 4.131n ± 0% +40.06% (p=0.000 n=20) Int32N1e8-8 2.950n ± 2% 4.131n ± 0% +40.03% (p=0.000 n=20) Int32N1e9-8 2.951n ± 2% 4.131n ± 0% +39.99% (p=0.000 n=20) Int32N2e9-8 2.950n ± 2% 4.131n ± 0% +40.03% (p=0.000 n=20) Float32-8 3.441n ± 0% 4.110n ± 0% +19.44% (p=0.000 n=20) Float64-8 3.442n ± 0% 4.104n ± 0% +19.24% (p=0.000 n=20) ExpFloat64-8 4.481n ± 0% 5.338n ± 0% +19.11% (p=0.000 n=20) NormFloat64-8 4.725n ± 0% 5.731n ± 0% +21.28% (p=0.000 n=20) Perm3-8 26.55n ± 0% 26.62n ± 0% +0.28% (p=0.000 n=20) Perm30-8 181.9n ± 0% 194.6n ± 2% +6.98% (p=0.000 n=20) Perm30ViaShuffle-8 142.9n ± 0% 156.4n ± 0% +9.45% (p=0.000 n=20) ShuffleOverhead-8 120.8n ± 2% 125.8n ± 0% +4.10% (p=0.000 n=20) 
Concurrent-8 2.421n ± 6% 2.654n ± 6% +9.67% (p=0.002 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 01ff938549.386 │ afa459a2f0.386 │ │ sec/op │ sec/op vs base │ PCG_DXSM-32 7.613n ± 1% 7.793n ± 2% +2.38% (p=0.000 n=20) SourceUint64-32 2.069n ± 0% 7.680n ± 1% +271.19% (p=0.000 n=20) GlobalInt64-32 3.456n ± 1% 3.474n ± 3% ~ (p=0.654 n=20) GlobalInt64Parallel-32 0.3252n ± 0% 0.3253n ± 0% ~ (p=0.952 n=20) GlobalUint64-32 3.573n ± 1% 3.433n ± 2% -3.92% (p=0.000 n=20) GlobalUint64Parallel-32 0.3159n ± 0% 0.3156n ± 0% ~ (p=0.223 n=20) Int64-32 2.562n ± 2% 7.707n ± 1% +200.74% (p=0.000 n=20) Uint64-32 2.592n ± 0% 7.714n ± 1% +197.65% (p=0.000 n=20) GlobalIntN1000-32 6.266n ± 2% 6.236n ± 1% ~ (p=0.039 n=20) IntN1000-32 4.724n ± 2% 10.410n ± 1% +120.39% (p=0.000 n=20) Int64N1000-32 5.490n ± 2% 10.975n ± 2% +99.89% (p=0.000 n=20) Int64N1e8-32 5.513n ± 2% 10.980n ± 1% +99.15% (p=0.000 n=20) Int64N1e9-32 5.476n ± 1% 10.950n ± 0% +99.96% (p=0.000 n=20) Int64N2e9-32 5.501n ± 2% 11.110n ± 1% +101.96% (p=0.000 n=20) Int64N1e18-32 9.043n ± 2% 15.180n ± 2% +67.86% (p=0.000 n=20) Int64N2e18-32 9.601n ± 2% 15.610n ± 1% +62.60% (p=0.000 n=20) Int64N4e18-32 12.00n ± 1% 19.23n ± 2% +60.14% (p=0.000 n=20) Int32N1000-32 4.829n ± 2% 10.345n ± 1% +114.25% (p=0.000 n=20) Int32N1e8-32 4.825n ± 2% 10.330n ± 1% +114.09% (p=0.000 n=20) Int32N1e9-32 4.830n ± 2% 10.350n ± 1% +114.26% (p=0.000 n=20) Int32N2e9-32 4.750n ± 2% 10.345n ± 1% +117.81% (p=0.000 n=20) Float32-32 10.89n ± 4% 13.57n ± 1% +24.61% (p=0.000 n=20) Float64-32 19.60n ± 4% 22.95n ± 4% +17.12% (p=0.000 n=20) ExpFloat64-32 12.96n ± 3% 15.23n ± 2% +17.47% (p=0.000 n=20) NormFloat64-32 7.516n ± 1% 13.780n ± 1% +83.34% (p=0.000 n=20) Perm3-32 36.78n ± 2% 46.62n ± 2% +26.72% (p=0.000 n=20) Perm30-32 238.9n ± 2% 400.7n ± 1% +67.73% (p=0.000 n=20) Perm30ViaShuffle-32 189.7n ± 2% 350.5n ± 1% +84.79% (p=0.000 n=20) ShuffleOverhead-32 159.8n ± 1% 326.0n ± 2% +104.01% (p=0.000 n=20) Concurrent-32 
3.286n ± 1% 3.290n ± 0% ~ (p=0.743 n=20) On the other hand, compared to the original "update benchmarks" CL, the cleanups we've made more than compensate for PCG being a bit slower than LFSR, at least on 64-bit x86. ARM64 (Apple M1) is a bit slower: perhaps the 64x64→128 multiply is slower there for some reason. 386 is noticeably slower, but it's also a non-SSA backend. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 220860f76f.amd64 │ afa459a2f0.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.555n ± 1% 1.450n ± 3% -6.78% (p=0.000 n=20) GlobalInt64-32 2.071n ± 1% 2.067n ± 2% ~ (p=0.673 n=20) GlobalInt63Parallel-32 0.1023n ± 1% GlobalInt64Parallel-32 0.1044n ± 2% GlobalUint64-32 5.193n ± 1% 2.085n ± 0% -59.86% (p=0.000 n=20) GlobalUint64Parallel-32 0.2341n ± 0% 0.1008n ± 1% -56.93% (p=0.000 n=20) Int64-32 2.056n ± 2% 1.779n ± 1% -13.47% (p=0.000 n=20) Uint64-32 2.077n ± 2% 1.854n ± 2% -10.74% (p=0.000 n=20) GlobalIntN1000-32 4.077n ± 2% 3.140n ± 3% -22.98% (p=0.000 n=20) IntN1000-32 3.476n ± 2% 2.496n ± 1% -28.19% (p=0.000 n=20) Int64N1000-32 3.059n ± 1% 2.510n ± 2% -17.96% (p=0.000 n=20) Int64N1e8-32 2.942n ± 1% 2.471n ± 2% -15.98% (p=0.000 n=20) Int64N1e9-32 2.932n ± 1% 2.488n ± 2% -15.14% (p=0.000 n=20) Int64N2e9-32 2.925n ± 1% 2.478n ± 2% -15.30% (p=0.000 n=20) Int64N1e18-32 3.116n ± 1% 3.088n ± 1% ~ (p=0.013 n=20) Int64N2e18-32 4.067n ± 1% 3.493n ± 1% -14.11% (p=0.000 n=20) Int64N4e18-32 4.054n ± 1% 5.060n ± 2% +24.80% (p=0.000 n=20) Int32N1000-32 2.951n ± 1% 2.620n ± 1% -11.22% (p=0.000 n=20) Int32N1e8-32 3.102n ± 1% 2.652n ± 0% -14.50% (p=0.000 n=20) Int32N1e9-32 3.535n ± 1% 2.644n ± 1% -25.20% (p=0.000 n=20) Int32N2e9-32 3.514n ± 1% 2.619n ± 2% -25.47% (p=0.000 n=20) Float32-32 2.760n ± 1% 2.261n ± 1% -18.06% (p=0.000 n=20) Float64-32 2.284n ± 1% 2.241n ± 2% ~ (p=0.016 n=20) ExpFloat64-32 3.757n ± 1% 3.716n ± 1% ~ (p=0.034 n=20) NormFloat64-32 3.837n ± 1% 3.718n ± 1% -3.09% (p=0.000 n=20) Perm3-32 35.23n ± 2% 
34.11n ± 2% -3.19% (p=0.000 n=20) Perm30-32 208.8n ± 1% 200.6n ± 0% -3.93% (p=0.000 n=20) Perm30ViaShuffle-32 111.7n ± 1% 109.7n ± 1% -1.84% (p=0.000 n=20) ShuffleOverhead-32 101.1n ± 1% 107.2n ± 1% +6.03% (p=0.000 n=20) Concurrent-32 2.108n ± 7% 2.108n ± 6% ~ (p=0.644 n=20) PCG_DXSM-32 1.488n ± 2% goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 220860f76f.arm64 │ afa459a2f0.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.316n ± 1% 2.531n ± 0% +9.33% (p=0.000 n=20) GlobalInt64-8 2.183n ± 1% 2.177n ± 1% ~ (p=0.533 n=20) GlobalInt63Parallel-8 0.4331n ± 0% GlobalInt64Parallel-8 0.4319n ± 0% GlobalUint64-8 4.377n ± 2% 2.185n ± 1% -50.07% (p=0.000 n=20) GlobalUint64Parallel-8 0.9237n ± 0% 0.4295n ± 1% -53.50% (p=0.000 n=20) Int64-8 2.538n ± 1% 4.104n ± 0% +61.68% (p=0.000 n=20) Uint64-8 2.604n ± 1% 4.080n ± 0% +56.68% (p=0.000 n=20) GlobalIntN1000-8 3.857n ± 2% 2.814n ± 1% -27.04% (p=0.000 n=20) IntN1000-8 3.822n ± 2% 4.140n ± 0% +8.32% (p=0.000 n=20) Int64N1000-8 3.318n ± 0% 4.139n ± 0% +24.74% (p=0.000 n=20) Int64N1e8-8 3.349n ± 1% 4.140n ± 0% +23.64% (p=0.000 n=20) Int64N1e9-8 3.317n ± 2% 4.139n ± 0% +24.80% (p=0.000 n=20) Int64N2e9-8 3.317n ± 2% 4.140n ± 0% +24.81% (p=0.000 n=20) Int64N1e18-8 3.542n ± 1% 5.273n ± 0% +48.85% (p=0.000 n=20) Int64N2e18-8 5.087n ± 0% 6.059n ± 0% +19.12% (p=0.000 n=20) Int64N4e18-8 5.084n ± 0% 8.803n ± 0% +73.16% (p=0.000 n=20) Int32N1000-8 3.208n ± 2% 4.131n ± 0% +28.79% (p=0.000 n=20) Int32N1e8-8 3.610n ± 1% 4.131n ± 0% +14.43% (p=0.000 n=20) Int32N1e9-8 4.235n ± 0% 4.131n ± 0% -2.44% (p=0.000 n=20) Int32N2e9-8 4.229n ± 1% 4.131n ± 0% -2.33% (p=0.000 n=20) Float32-8 3.468n ± 0% 4.110n ± 0% +18.50% (p=0.000 n=20) Float64-8 3.447n ± 0% 4.104n ± 0% +19.05% (p=0.000 n=20) ExpFloat64-8 4.567n ± 0% 5.338n ± 0% +16.86% (p=0.000 n=20) NormFloat64-8 4.821n ± 0% 5.731n ± 0% +18.89% (p=0.000 n=20) Perm3-8 28.89n ± 0% 26.62n ± 0% -7.84% (p=0.000 n=20) Perm30-8 175.7n ± 0% 194.6n ± 2% +10.76% (p=0.000 n=20) Perm30ViaShuffle-8 
153.5n ± 0% 156.4n ± 0% +1.86% (p=0.000 n=20) ShuffleOverhead-8 119.8n ± 1% 125.8n ± 0% +4.97% (p=0.000 n=20) Concurrent-8 2.433n ± 3% 2.654n ± 6% +9.13% (p=0.001 n=20) PCG_DXSM-8 2.531n ± 0% goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 220860f76f.386 │ afa459a2f0.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.370n ± 1% 7.680n ± 1% +224.05% (p=0.000 n=20) GlobalInt64-32 3.569n ± 1% 3.474n ± 3% -2.66% (p=0.001 n=20) GlobalInt63Parallel-32 0.3221n ± 1% GlobalInt64Parallel-32 0.3253n ± 0% GlobalUint64-32 8.797n ± 10% 3.433n ± 2% -60.98% (p=0.000 n=20) GlobalUint64Parallel-32 0.6351n ± 0% 0.3156n ± 0% -50.31% (p=0.000 n=20) Int64-32 2.612n ± 2% 7.707n ± 1% +195.04% (p=0.000 n=20) Uint64-32 3.350n ± 1% 7.714n ± 1% +130.25% (p=0.000 n=20) GlobalIntN1000-32 5.892n ± 1% 6.236n ± 1% +5.82% (p=0.000 n=20) IntN1000-32 4.546n ± 1% 10.410n ± 1% +128.97% (p=0.000 n=20) Int64N1000-32 14.59n ± 1% 10.97n ± 2% -24.75% (p=0.000 n=20) Int64N1e8-32 14.76n ± 2% 10.98n ± 1% -25.58% (p=0.000 n=20) Int64N1e9-32 16.57n ± 1% 10.95n ± 0% -33.90% (p=0.000 n=20) Int64N2e9-32 14.54n ± 1% 11.11n ± 1% -23.62% (p=0.000 n=20) Int64N1e18-32 16.14n ± 1% 15.18n ± 2% -5.95% (p=0.000 n=20) Int64N2e18-32 18.10n ± 1% 15.61n ± 1% -13.73% (p=0.000 n=20) Int64N4e18-32 18.65n ± 1% 19.23n ± 2% +3.08% (p=0.000 n=20) Int32N1000-32 3.560n ± 1% 10.345n ± 1% +190.55% (p=0.000 n=20) Int32N1e8-32 3.770n ± 2% 10.330n ± 1% +174.01% (p=0.000 n=20) Int32N1e9-32 4.098n ± 0% 10.350n ± 1% +152.53% (p=0.000 n=20) Int32N2e9-32 4.179n ± 1% 10.345n ± 1% +147.52% (p=0.000 n=20) Float32-32 21.18n ± 4% 13.57n ± 1% -35.93% (p=0.000 n=20) Float64-32 20.60n ± 2% 22.95n ± 4% +11.41% (p=0.000 n=20) ExpFloat64-32 13.07n ± 0% 15.23n ± 2% +16.48% (p=0.000 n=20) NormFloat64-32 7.738n ± 2% 13.780n ± 1% +78.08% (p=0.000 n=20) Perm3-32 36.73n ± 1% 46.62n ± 2% +26.91% (p=0.000 n=20) Perm30-32 211.9n ± 1% 400.7n ± 1% +89.05% (p=0.000 n=20) Perm30ViaShuffle-32 165.2n ± 1% 350.5n ± 1% +112.20% (p=0.000 
n=20) ShuffleOverhead-32 133.9n ± 1% 326.0n ± 2% +143.37% (p=0.000 n=20) Concurrent-32 3.287n ± 2% 3.290n ± 0% ~ (p=0.365 n=20) PCG_DXSM-32 7.793n ± 2% For #61716. Change-Id: I4e9c0525b5f84a2ac46f23da9e365495e2d05777 Reviewed-on: https://go-review.googlesource.com/c/go/+/502506 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2023-10-30 17:09:26 +00:00
Russ Cox	8631fcbf31	math/rand/v2: add PCG-DXSM For the original math/rand, we ported Plan 9's random number generator, which was a refinement by Ken Thompson of an algorithm by Don Mitchell and Jim Reeds, which Mitchell in turn recalls as having been derived from an algorithm by Marsaglia. At its core, it is an additive lagged Fibonacci generator (ALFG). Whatever the details of the history, this generator is nowhere near the current state of the art for simple, pseudo-random generators. This CL adds an implementation of Melissa O'Neill's PCG, specifically the variant PCG-DXSM, which she defined after writing the PCG paper and which is now the default in Numpy. The update is slightly slower (a few multiplies and adds, instead of a few adds), but the state is dramatically smaller (2 words instead of 607). The statistical output properties are better too. A followup CL will delete the old generator. PCG is the only change here, so no benchmarks should be affected. Including them anyway as further evidence for caution. 
goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 8993506f2f.amd64 │ 01ff938549.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.325n ± 1% 1.352n ± 1% +2.00% (p=0.000 n=20) GlobalInt64-32 2.240n ± 1% 2.083n ± 0% -7.03% (p=0.000 n=20) GlobalInt64Parallel-32 0.1041n ± 1% 0.1035n ± 1% ~ (p=0.064 n=20) GlobalUint64-32 2.072n ± 3% 2.038n ± 1% ~ (p=0.089 n=20) GlobalUint64Parallel-32 0.1008n ± 1% 0.1006n ± 1% ~ (p=0.804 n=20) Int64-32 1.716n ± 1% 1.687n ± 2% ~ (p=0.045 n=20) Uint64-32 1.665n ± 1% 1.674n ± 2% ~ (p=0.878 n=20) GlobalIntN1000-32 3.335n ± 1% 3.135n ± 1% -6.00% (p=0.000 n=20) IntN1000-32 2.484n ± 1% 2.478n ± 1% ~ (p=0.085 n=20) Int64N1000-32 2.502n ± 2% 2.455n ± 1% -1.88% (p=0.002 n=20) Int64N1e8-32 2.484n ± 2% 2.467n ± 2% ~ (p=0.048 n=20) Int64N1e9-32 2.502n ± 0% 2.454n ± 1% -1.92% (p=0.000 n=20) Int64N2e9-32 2.502n ± 0% 2.482n ± 1% -0.76% (p=0.000 n=20) Int64N1e18-32 3.201n ± 1% 3.349n ± 2% +4.62% (p=0.000 n=20) Int64N2e18-32 3.504n ± 1% 3.537n ± 1% ~ (p=0.185 n=20) Int64N4e18-32 4.873n ± 1% 4.917n ± 0% +0.90% (p=0.000 n=20) Int32N1000-32 2.639n ± 1% 2.386n ± 1% -9.57% (p=0.000 n=20) Int32N1e8-32 2.686n ± 2% 2.366n ± 1% -11.91% (p=0.000 n=20) Int32N1e9-32 2.636n ± 1% 2.355n ± 2% -10.70% (p=0.000 n=20) Int32N2e9-32 2.660n ± 1% 2.371n ± 1% -10.88% (p=0.000 n=20) Float32-32 2.261n ± 1% 2.245n ± 2% ~ (p=0.752 n=20) Float64-32 2.280n ± 1% 2.235n ± 1% -1.97% (p=0.007 n=20) ExpFloat64-32 3.891n ± 1% 3.813n ± 3% ~ (p=0.087 n=20) NormFloat64-32 3.711n ± 1% 3.652n ± 2% ~ (p=0.021 n=20) Perm3-32 32.60n ± 2% 33.12n ± 3% ~ (p=0.107 n=20) Perm30-32 204.2n ± 0% 205.1n ± 1% ~ (p=0.358 n=20) Perm30ViaShuffle-32 121.7n ± 2% 110.8n ± 1% -8.96% (p=0.000 n=20) ShuffleOverhead-32 106.2n ± 2% 113.0n ± 1% +6.36% (p=0.000 n=20) Concurrent-32 2.190n ± 5% 2.100n ± 0% -4.13% (p=0.001 n=20) PCG_DXSM-32 1.490n ± 0% goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 8993506f2f.arm64 │ 01ff938549.arm64 │ │ sec/op │ sec/op vs base │ 
SourceUint64-8 2.271n ± 0% 2.258n ± 1% ~ (p=0.167 n=20) GlobalInt64-8 2.161n ± 1% 2.167n ± 0% ~ (p=0.693 n=20) GlobalInt64Parallel-8 0.4303n ± 0% 0.4310n ± 0% ~ (p=0.051 n=20) GlobalUint64-8 2.164n ± 1% 2.182n ± 1% ~ (p=0.042 n=20) GlobalUint64Parallel-8 0.4287n ± 0% 0.4297n ± 0% ~ (p=0.082 n=20) Int64-8 2.478n ± 1% 2.472n ± 1% ~ (p=0.151 n=20) Uint64-8 2.460n ± 1% 2.449n ± 1% ~ (p=0.013 n=20) GlobalIntN1000-8 2.814n ± 2% 2.814n ± 2% ~ (p=0.821 n=20) IntN1000-8 3.003n ± 2% 2.998n ± 2% ~ (p=0.024 n=20) Int64N1000-8 2.954n ± 0% 2.949n ± 2% ~ (p=0.192 n=20) Int64N1e8-8 2.956n ± 0% 2.953n ± 2% ~ (p=0.109 n=20) Int64N1e9-8 3.325n ± 0% 2.950n ± 0% -11.26% (p=0.000 n=20) Int64N2e9-8 2.956n ± 2% 2.946n ± 2% ~ (p=0.027 n=20) Int64N1e18-8 3.780n ± 1% 3.779n ± 1% ~ (p=0.815 n=20) Int64N2e18-8 4.385n ± 0% 4.370n ± 1% ~ (p=0.402 n=20) Int64N4e18-8 6.527n ± 0% 6.544n ± 1% ~ (p=0.140 n=20) Int32N1000-8 2.964n ± 1% 2.950n ± 0% -0.47% (p=0.002 n=20) Int32N1e8-8 2.964n ± 1% 2.950n ± 2% ~ (p=0.013 n=20) Int32N1e9-8 2.963n ± 2% 2.951n ± 2% ~ (p=0.062 n=20) Int32N2e9-8 2.961n ± 2% 2.950n ± 2% -0.37% (p=0.002 n=20) Float32-8 3.442n ± 0% 3.441n ± 0% ~ (p=0.211 n=20) Float64-8 3.442n ± 0% 3.442n ± 0% ~ (p=0.067 n=20) ExpFloat64-8 4.472n ± 0% 4.481n ± 0% +0.20% (p=0.000 n=20) NormFloat64-8 4.734n ± 0% 4.725n ± 0% -0.19% (p=0.003 n=20) Perm3-8 26.55n ± 0% 26.55n ± 0% ~ (p=0.833 n=20) Perm30-8 181.9n ± 0% 181.9n ± 0% -0.03% (p=0.004 n=20) Perm30ViaShuffle-8 143.1n ± 0% 142.9n ± 0% ~ (p=0.204 n=20) ShuffleOverhead-8 120.6n ± 1% 120.8n ± 2% ~ (p=0.102 n=20) Concurrent-8 2.357n ± 2% 2.421n ± 6% ~ (p=0.016 n=20) PCG_DXSM-8 2.531n ± 0% goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 8993506f2f.386 │ 01ff938549.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.102n ± 2% 2.069n ± 0% ~ (p=0.021 n=20) GlobalInt64-32 3.542n ± 2% 3.456n ± 1% -2.44% (p=0.001 n=20) GlobalInt64Parallel-32 0.3202n ± 0% 0.3252n ± 0% +1.56% (p=0.000 n=20) GlobalUint64-32 3.507n ± 1% 
3.573n ± 1% +1.87% (p=0.000 n=20) GlobalUint64Parallel-32 0.3170n ± 1% 0.3159n ± 0% ~ (p=0.167 n=20) Int64-32 2.516n ± 1% 2.562n ± 2% ~ (p=0.016 n=20) Uint64-32 2.544n ± 1% 2.592n ± 0% +1.85% (p=0.000 n=20) GlobalIntN1000-32 6.237n ± 1% 6.266n ± 2% ~ (p=0.268 n=20) IntN1000-32 4.670n ± 2% 4.724n ± 2% ~ (p=0.644 n=20) Int64N1000-32 5.412n ± 1% 5.490n ± 2% ~ (p=0.159 n=20) Int64N1e8-32 5.414n ± 2% 5.513n ± 2% ~ (p=0.129 n=20) Int64N1e9-32 5.473n ± 1% 5.476n ± 1% ~ (p=0.723 n=20) Int64N2e9-32 5.487n ± 1% 5.501n ± 2% ~ (p=0.481 n=20) Int64N1e18-32 8.901n ± 2% 9.043n ± 2% ~ (p=0.330 n=20) Int64N2e18-32 9.521n ± 1% 9.601n ± 2% ~ (p=0.703 n=20) Int64N4e18-32 11.92n ± 1% 12.00n ± 1% ~ (p=0.489 n=20) Int32N1000-32 4.785n ± 1% 4.829n ± 2% ~ (p=0.402 n=20) Int32N1e8-32 4.748n ± 1% 4.825n ± 2% ~ (p=0.218 n=20) Int32N1e9-32 4.810n ± 1% 4.830n ± 2% ~ (p=0.794 n=20) Int32N2e9-32 4.812n ± 1% 4.750n ± 2% ~ (p=0.057 n=20) Float32-32 10.48n ± 4% 10.89n ± 4% ~ (p=0.162 n=20) Float64-32 19.79n ± 3% 19.60n ± 4% ~ (p=0.668 n=20) ExpFloat64-32 12.91n ± 3% 12.96n ± 3% ~ (p=1.000 n=20) NormFloat64-32 7.462n ± 1% 7.516n ± 1% ~ (p=0.051 n=20) Perm3-32 35.98n ± 2% 36.78n ± 2% ~ (p=0.033 n=20) Perm30-32 241.5n ± 1% 238.9n ± 2% ~ (p=0.126 n=20) Perm30ViaShuffle-32 187.3n ± 2% 189.7n ± 2% ~ (p=0.387 n=20) ShuffleOverhead-32 160.2n ± 1% 159.8n ± 1% ~ (p=0.256 n=20) Concurrent-32 3.308n ± 3% 3.286n ± 1% ~ (p=0.038 n=20) PCG_DXSM-32 7.613n ± 1% For #61716. Change-Id: Icb274ca1f782504d658305a40159b4ae6a2f3f1d Reviewed-on: https://go-review.googlesource.com/c/go/+/502505 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Rob Pike <r@golang.org>	2023-10-30 17:09:23 +00:00
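
The "2 words instead of 607" and "a few multiplies and adds" remarks in the PCG-DXSM message above can be made concrete with a minimal sketch. This is an illustration under stated assumptions, not the code added by that CL: the type `pcg`, the methods `step` and `next64`, and every numeric constant below are hypothetical stand-ins chosen to show the shape of a 128-bit linear congruential step followed by a DXSM-style output mix.

```go
package main

import (
	"fmt"
	"math/bits"
)

// pcg is a hypothetical generator whose entire state is two 64-bit words.
type pcg struct{ hi, lo uint64 }

// step advances the 128-bit state as state = state*mul + inc (mod 2^128).
// The multiplier and increment are assumptions used for illustration only.
func (p *pcg) step() (hi, lo uint64) {
	const (
		mulHi = 2549297995355413924
		mulLo = 4865540595714422341
		incHi = 6364136223846793005
		incLo = 1442695040888963407
	)
	hi, lo = bits.Mul64(p.lo, mulLo) // full 64x64→128 product of the low words
	hi += p.hi*mulLo + p.lo*mulHi    // cross terms only reach the high word
	lo, carry := bits.Add64(lo, incLo, 0)
	hi, _ = bits.Add64(hi, incHi, carry)
	p.hi, p.lo = hi, lo
	return hi, lo
}

// next64 applies a DXSM-style mix (xorshift, multiply, xorshift, multiply)
// to the updated state and returns 64 output bits.
func (p *pcg) next64() uint64 {
	hi, lo := p.step()
	const cheapMul = 0xda942042e4dd58b5 // assumed 64-bit multiplier
	hi ^= hi >> 32
	hi *= cheapMul
	hi ^= hi >> 48
	hi *= lo | 1 // force the multiplier odd so the step stays invertible
	return hi
}

func main() {
	a := &pcg{hi: 1, lo: 2}
	b := &pcg{hi: 1, lo: 2}
	// Two generators with identical state produce identical streams.
	fmt.Println(a.next64() == b.next64())
}
```

The `bits.Mul64` call is the 64x64→128 multiply that the commit messages above suspect of being slower on ARM64, and the cross-term line is where a 32-bit target such as 386 has to synthesize several wide multiplies.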
Russ Cox	f2e2637227	math/rand/v2: simplify Perm The compiler says Perm is being inlined into BenchmarkPerm, and yet BenchmarkPerm30ViaShuffle, which you'd think is the same code, still runs significantly faster. The benchmarks are mystifying but this is clearly still a step in the right direction, since BenchmarkPerm30ViaShuffle is still the fastest and we avoid having two copies of that logic. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ e1bbe739fb.amd64 │ 8993506f2f.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.316n ± 2% 1.325n ± 1% ~ (p=0.208 n=20) GlobalInt64-32 2.048n ± 1% 2.240n ± 1% +9.38% (p=0.000 n=20) GlobalInt64Parallel-32 0.1037n ± 1% 0.1041n ± 1% ~ (p=0.774 n=20) GlobalUint64-32 2.039n ± 2% 2.072n ± 3% ~ (p=0.115 n=20) GlobalUint64Parallel-32 0.1013n ± 1% 0.1008n ± 1% ~ (p=0.417 n=20) Int64-32 1.692n ± 2% 1.716n ± 1% ~ (p=0.122 n=20) Uint64-32 1.643n ± 2% 1.665n ± 1% ~ (p=0.062 n=20) GlobalIntN1000-32 3.287n ± 1% 3.335n ± 1% ~ (p=0.147 n=20) IntN1000-32 2.678n ± 2% 2.484n ± 1% -7.24% (p=0.000 n=20) Int64N1000-32 2.684n ± 2% 2.502n ± 2% -6.80% (p=0.000 n=20) Int64N1e8-32 2.663n ± 2% 2.484n ± 2% -6.76% (p=0.000 n=20) Int64N1e9-32 2.633n ± 1% 2.502n ± 0% -4.98% (p=0.000 n=20) Int64N2e9-32 2.657n ± 1% 2.502n ± 0% -5.87% (p=0.000 n=20) Int64N1e18-32 3.125n ± 2% 3.201n ± 1% +2.43% (p=0.000 n=20) Int64N2e18-32 3.476n ± 1% 3.504n ± 1% +0.83% (p=0.009 n=20) Int64N4e18-32 4.795n ± 1% 4.873n ± 1% ~ (p=0.106 n=20) Int32N1000-32 2.485n ± 2% 2.639n ± 1% +6.20% (p=0.000 n=20) Int32N1e8-32 2.457n ± 1% 2.686n ± 2% +9.34% (p=0.000 n=20) Int32N1e9-32 2.452n ± 1% 2.636n ± 1% +7.52% (p=0.000 n=20) Int32N2e9-32 2.453n ± 1% 2.660n ± 1% +8.44% (p=0.000 n=20) Float32-32 2.254n ± 1% 2.261n ± 1% ~ (p=0.888 n=20) Float64-32 2.262n ± 1% 2.280n ± 1% ~ (p=0.040 n=20) ExpFloat64-32 3.777n ± 2% 3.891n ± 1% +3.03% (p=0.000 n=20) NormFloat64-32 3.606n ± 1% 3.711n ± 1% +2.91% (p=0.000 n=20) Perm3-32 33.12n ± 2% 32.60n ± 2% ~ (p=0.045 
n=20) Perm30-32 176.1n ± 1% 204.2n ± 0% +15.96% (p=0.000 n=20) Perm30ViaShuffle-32 109.3n ± 1% 121.7n ± 2% +11.30% (p=0.000 n=20) ShuffleOverhead-32 112.5n ± 1% 106.2n ± 2% -5.56% (p=0.000 n=20) Concurrent-32 2.099n ± 0% 2.190n ± 5% +4.36% (p=0.001 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ e1bbe739fb.arm64 │ 8993506f2f.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.290n ± 1% 2.271n ± 0% ~ (p=0.015 n=20) GlobalInt64-8 2.180n ± 1% 2.161n ± 1% ~ (p=0.180 n=20) GlobalInt64Parallel-8 0.4294n ± 0% 0.4303n ± 0% +0.19% (p=0.001 n=20) GlobalUint64-8 2.170n ± 1% 2.164n ± 1% ~ (p=0.673 n=20) GlobalUint64Parallel-8 0.4283n ± 0% 0.4287n ± 0% ~ (p=0.128 n=20) Int64-8 2.481n ± 1% 2.478n ± 1% ~ (p=0.867 n=20) Uint64-8 2.464n ± 1% 2.460n ± 1% ~ (p=0.763 n=20) GlobalIntN1000-8 2.814n ± 0% 2.814n ± 2% ~ (p=0.969 n=20) IntN1000-8 2.934n ± 2% 3.003n ± 2% +2.35% (p=0.000 n=20) Int64N1000-8 2.957n ± 1% 2.954n ± 0% ~ (p=0.285 n=20) Int64N1e8-8 2.935n ± 2% 2.956n ± 0% +0.73% (p=0.002 n=20) Int64N1e9-8 2.935n ± 2% 3.325n ± 0% +13.29% (p=0.000 n=20) Int64N2e9-8 2.933n ± 4% 2.956n ± 2% ~ (p=0.163 n=20) Int64N1e18-8 3.781n ± 1% 3.780n ± 1% ~ (p=0.805 n=20) Int64N2e18-8 4.362n ± 0% 4.385n ± 0% ~ (p=0.077 n=20) Int64N4e18-8 6.576n ± 1% 6.527n ± 0% ~ (p=0.024 n=20) Int32N1000-8 2.942n ± 2% 2.964n ± 1% ~ (p=0.073 n=20) Int32N1e8-8 2.941n ± 1% 2.964n ± 1% ~ (p=0.058 n=20) Int32N1e9-8 2.938n ± 2% 2.963n ± 2% +0.87% (p=0.003 n=20) Int32N2e9-8 2.982n ± 2% 2.961n ± 2% ~ (p=0.056 n=20) Float32-8 3.441n ± 0% 3.442n ± 0% ~ (p=0.030 n=20) Float64-8 3.441n ± 0% 3.442n ± 0% +0.03% (p=0.001 n=20) ExpFloat64-8 4.472n ± 0% 4.472n ± 0% ~ (p=0.877 n=20) NormFloat64-8 4.716n ± 0% 4.734n ± 0% +0.38% (p=0.000 n=20) Perm3-8 26.66n ± 0% 26.55n ± 0% -0.39% (p=0.000 n=20) Perm30-8 143.3n ± 0% 181.9n ± 0% +26.97% (p=0.000 n=20) Perm30ViaShuffle-8 142.9n ± 0% 143.1n ± 0% ~ (p=0.669 n=20) ShuffleOverhead-8 121.1n ± 1% 120.6n ± 1% -0.41% (p=0.004 n=20) Concurrent-8 2.379n ± 2% 2.357n ± 2% ~ 
(p=0.337 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ e1bbe739fb.386 │ 8993506f2f.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.087n ± 1% 2.102n ± 2% ~ (p=0.507 n=20) GlobalInt64-32 3.538n ± 2% 3.542n ± 2% ~ (p=0.425 n=20) GlobalInt64Parallel-32 0.3207n ± 1% 0.3202n ± 0% ~ (p=0.963 n=20) GlobalUint64-32 3.543n ± 1% 3.507n ± 1% ~ (p=0.034 n=20) GlobalUint64Parallel-32 0.3170n ± 0% 0.3170n ± 1% ~ (p=0.920 n=20) Int64-32 2.548n ± 1% 2.516n ± 1% ~ (p=0.139 n=20) Uint64-32 2.565n ± 2% 2.544n ± 1% ~ (p=0.394 n=20) GlobalIntN1000-32 6.300n ± 1% 6.237n ± 1% ~ (p=0.029 n=20) IntN1000-32 4.750n ± 0% 4.670n ± 2% ~ (p=0.034 n=20) Int64N1000-32 5.515n ± 2% 5.412n ± 1% -1.86% (p=0.009 n=20) Int64N1e8-32 5.527n ± 0% 5.414n ± 2% -2.05% (p=0.002 n=20) Int64N1e9-32 5.531n ± 2% 5.473n ± 1% ~ (p=0.047 n=20) Int64N2e9-32 5.514n ± 2% 5.487n ± 1% ~ (p=0.298 n=20) Int64N1e18-32 9.059n ± 1% 8.901n ± 2% ~ (p=0.037 n=20) Int64N2e18-32 9.594n ± 1% 9.521n ± 1% ~ (p=0.051 n=20) Int64N4e18-32 12.05n ± 2% 11.92n ± 1% ~ (p=0.357 n=20) Int32N1000-32 4.840n ± 2% 4.785n ± 1% ~ (p=0.189 n=20) Int32N1e8-32 4.832n ± 2% 4.748n ± 1% ~ (p=0.042 n=20) Int32N1e9-32 4.815n ± 2% 4.810n ± 1% ~ (p=0.878 n=20) Int32N2e9-32 4.813n ± 1% 4.812n ± 1% ~ (p=0.542 n=20) Float32-32 10.90n ± 2% 10.48n ± 4% -3.85% (p=0.007 n=20) Float64-32 20.32n ± 4% 19.79n ± 3% ~ (p=0.553 n=20) ExpFloat64-32 12.95n ± 3% 12.91n ± 3% ~ (p=0.909 n=20) NormFloat64-32 7.570n ± 1% 7.462n ± 1% -1.44% (p=0.004 n=20) Perm3-32 37.80n ± 2% 35.98n ± 2% -4.79% (p=0.000 n=20) Perm30-32 214.0n ± 1% 241.5n ± 1% +12.85% (p=0.000 n=20) Perm30ViaShuffle-32 188.7n ± 2% 187.3n ± 2% ~ (p=0.029 n=20) ShuffleOverhead-32 160.8n ± 1% 160.2n ± 1% ~ (p=0.180 n=20) Concurrent-32 3.288n ± 0% 3.308n ± 3% ~ (p=0.037 n=20) For #61716. 
Change-Id: I342b611456c3569520d3c91c849d29eba325d87e Reviewed-on: https://go-review.googlesource.com/c/go/+/502504 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Rob Pike <r@golang.org>	2023-10-30 17:09:21 +00:00
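
The "simplify Perm" message above talks about avoiding two copies of the shuffling logic. One way to express a permutation in terms of a shuffle is sketched below. It is a sketch under assumptions, not the CL's code: `permViaShuffle` is a hypothetical helper, and `math/rand`'s package-level `Shuffle` stands in for the v2 function of the same shape.

```go
package main

import (
	"fmt"
	"math/rand"
)

// permViaShuffle fills a slice with the identity permutation 0..n-1 and
// then lets the supplied shuffle routine reorder it through swap calls.
// Any routine with the Shuffle(n, swap) shape can be plugged in.
func permViaShuffle(n int, shuffle func(n int, swap func(i, j int))) []int {
	p := make([]int, n)
	for i := range p {
		p[i] = i
	}
	shuffle(n, func(i, j int) { p[i], p[j] = p[j], p[i] })
	return p
}

func main() {
	// rand.Shuffle performs the random swaps; the only logic that lives
	// here is building the identity slice.
	fmt.Println(permViaShuffle(10, rand.Shuffle))
}
```

Written this way, the random swapping exists in one place only, which is the property the message credits for the simplification.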
Branden Brown	488e2a56b9	math/rand/v2: remove bias in ExpFloat64 and NormFloat64 The original implementation of the ziggurat algorithm was designed for 32-bit random integer inputs. This necessitated reusing some low-order bits for the slice selection and the random coordinate, which introduces statistical bias. The result is that PractRand consistently fails the math/rand normal and exponential sequences (transformed to uniform) within 2 GB of variates. This change adjusts the ziggurat procedures to use 63-bit random inputs, so that there is no need to reuse bits between the slice and coordinate. This is sufficient for the normal sequence to survive to 256 GB of PractRand testing. An alternative technique is to recalculate the ziggurats to use 1024 rather than 128 or 256 slices to make full use of 64-bit inputs. This improves the survival of the normal sequence to far beyond 256 GB and additionally provides a 6% performance improvement due to the improved rejection procedure efficiency. However, doing so increases the total size of the ziggurat tables from 4.5 kB to 48 kB. 
goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 2703446c2e.amd64 │ e1bbe739fb.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.337n ± 1% 1.316n ± 2% ~ (p=0.024 n=20) GlobalInt64-32 2.225n ± 2% 2.048n ± 1% -7.93% (p=0.000 n=20) GlobalInt64Parallel-32 0.1043n ± 2% 0.1037n ± 1% ~ (p=0.587 n=20) GlobalUint64-32 2.058n ± 1% 2.039n ± 2% ~ (p=0.030 n=20) GlobalUint64Parallel-32 0.1009n ± 1% 0.1013n ± 1% ~ (p=0.984 n=20) Int64-32 1.719n ± 2% 1.692n ± 2% ~ (p=0.085 n=20) Uint64-32 1.669n ± 1% 1.643n ± 2% ~ (p=0.049 n=20) GlobalIntN1000-32 3.321n ± 2% 3.287n ± 1% ~ (p=0.298 n=20) IntN1000-32 2.479n ± 1% 2.678n ± 2% +8.01% (p=0.000 n=20) Int64N1000-32 2.477n ± 1% 2.684n ± 2% +8.38% (p=0.000 n=20) Int64N1e8-32 2.490n ± 1% 2.663n ± 2% +6.99% (p=0.000 n=20) Int64N1e9-32 2.458n ± 1% 2.633n ± 1% +7.12% (p=0.000 n=20) Int64N2e9-32 2.486n ± 2% 2.657n ± 1% +6.90% (p=0.000 n=20) Int64N1e18-32 3.215n ± 2% 3.125n ± 2% -2.78% (p=0.000 n=20) Int64N2e18-32 3.588n ± 2% 3.476n ± 1% -3.15% (p=0.000 n=20) Int64N4e18-32 4.938n ± 2% 4.795n ± 1% -2.91% (p=0.000 n=20) Int32N1000-32 2.673n ± 2% 2.485n ± 2% -7.02% (p=0.000 n=20) Int32N1e8-32 2.631n ± 2% 2.457n ± 1% -6.63% (p=0.000 n=20) Int32N1e9-32 2.628n ± 2% 2.452n ± 1% -6.70% (p=0.000 n=20) Int32N2e9-32 2.684n ± 2% 2.453n ± 1% -8.61% (p=0.000 n=20) Float32-32 2.240n ± 2% 2.254n ± 1% ~ (p=0.878 n=20) Float64-32 2.253n ± 1% 2.262n ± 1% ~ (p=0.963 n=20) ExpFloat64-32 3.677n ± 1% 3.777n ± 2% +2.71% (p=0.004 n=20) NormFloat64-32 3.761n ± 1% 3.606n ± 1% -4.15% (p=0.000 n=20) Perm3-32 33.55n ± 2% 33.12n ± 2% ~ (p=0.402 n=20) Perm30-32 173.2n ± 1% 176.1n ± 1% +1.67% (p=0.000 n=20) Perm30ViaShuffle-32 115.9n ± 1% 109.3n ± 1% -5.69% (p=0.000 n=20) ShuffleOverhead-32 101.9n ± 1% 112.5n ± 1% +10.35% (p=0.000 n=20) Concurrent-32 2.107n ± 6% 2.099n ± 0% ~ (p=0.051 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 2703446c2e.arm64 │ e1bbe739fb.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.275n 
± 0% 2.290n ± 1% ~ (p=0.044 n=20) GlobalInt64-8 2.154n ± 1% 2.180n ± 1% ~ (p=0.068 n=20) GlobalInt64Parallel-8 0.4298n ± 0% 0.4294n ± 0% ~ (p=0.079 n=20) GlobalUint64-8 2.160n ± 1% 2.170n ± 1% ~ (p=0.129 n=20) GlobalUint64Parallel-8 0.4286n ± 0% 0.4283n ± 0% ~ (p=0.350 n=20) Int64-8 2.491n ± 1% 2.481n ± 1% ~ (p=0.330 n=20) Uint64-8 2.458n ± 0% 2.464n ± 1% ~ (p=0.351 n=20) GlobalIntN1000-8 2.814n ± 2% 2.814n ± 0% ~ (p=0.325 n=20) IntN1000-8 2.933n ± 0% 2.934n ± 2% ~ (p=0.079 n=20) Int64N1000-8 2.962n ± 1% 2.957n ± 1% ~ (p=0.259 n=20) Int64N1e8-8 2.960n ± 1% 2.935n ± 2% ~ (p=0.276 n=20) Int64N1e9-8 2.935n ± 2% 2.935n ± 2% ~ (p=0.984 n=20) Int64N2e9-8 2.934n ± 0% 2.933n ± 4% ~ (p=0.463 n=20) Int64N1e18-8 3.777n ± 1% 3.781n ± 1% ~ (p=0.516 n=20) Int64N2e18-8 4.359n ± 1% 4.362n ± 0% ~ (p=0.256 n=20) Int64N4e18-8 6.536n ± 1% 6.576n ± 1% ~ (p=0.224 n=20) Int32N1000-8 2.937n ± 0% 2.942n ± 2% ~ (p=0.312 n=20) Int32N1e8-8 2.937n ± 1% 2.941n ± 1% ~ (p=0.463 n=20) Int32N1e9-8 2.936n ± 0% 2.938n ± 2% ~ (p=0.044 n=20) Int32N2e9-8 2.938n ± 2% 2.982n ± 2% ~ (p=0.174 n=20) Float32-8 3.441n ± 0% 3.441n ± 0% ~ (p=0.064 n=20) Float64-8 3.441n ± 0% 3.441n ± 0% ~ (p=0.826 n=20) ExpFloat64-8 4.486n ± 0% 4.472n ± 0% -0.31% (p=0.000 n=20) NormFloat64-8 4.721n ± 0% 4.716n ± 0% ~ (p=0.051 n=20) Perm3-8 26.65n ± 0% 26.66n ± 0% ~ (p=0.080 n=20) Perm30-8 143.2n ± 0% 143.3n ± 0% +0.10% (p=0.000 n=20) Perm30ViaShuffle-8 143.0n ± 0% 142.9n ± 0% ~ (p=0.642 n=20) ShuffleOverhead-8 120.6n ± 1% 121.1n ± 1% +0.41% (p=0.010 n=20) Concurrent-8 2.399n ± 5% 2.379n ± 2% ~ (p=0.365 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 2703446c2e.386 │ e1bbe739fb.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.072n ± 2% 2.087n ± 1% ~ (p=0.440 n=20) GlobalInt64-32 3.546n ± 27% 3.538n ± 2% ~ (p=0.101 n=20) GlobalInt64Parallel-32 0.3211n ± 0% 0.3207n ± 1% ~ (p=0.753 n=20) GlobalUint64-32 3.522n ± 2% 3.543n ± 1% ~ (p=0.071 n=20) GlobalUint64Parallel-32 0.3172n ± 0% 0.3170n 
± 0% ~ (p=0.507 n=20) Int64-32 2.520n ± 2% 2.548n ± 1% ~ (p=0.267 n=20) Uint64-32 2.581n ± 1% 2.565n ± 2% ~ (p=0.143 n=20) GlobalIntN1000-32 6.171n ± 1% 6.300n ± 1% ~ (p=0.037 n=20) IntN1000-32 4.752n ± 2% 4.750n ± 0% ~ (p=0.984 n=20) Int64N1000-32 5.429n ± 1% 5.515n ± 2% ~ (p=0.292 n=20) Int64N1e8-32 5.469n ± 2% 5.527n ± 0% ~ (p=0.013 n=20) Int64N1e9-32 5.489n ± 2% 5.531n ± 2% ~ (p=0.256 n=20) Int64N2e9-32 5.492n ± 2% 5.514n ± 2% ~ (p=0.606 n=20) Int64N1e18-32 8.927n ± 1% 9.059n ± 1% ~ (p=0.229 n=20) Int64N2e18-32 9.622n ± 1% 9.594n ± 1% ~ (p=0.703 n=20) Int64N4e18-32 12.03n ± 1% 12.05n ± 2% ~ (p=0.733 n=20) Int32N1000-32 4.817n ± 1% 4.840n ± 2% ~ (p=0.941 n=20) Int32N1e8-32 4.801n ± 1% 4.832n ± 2% ~ (p=0.228 n=20) Int32N1e9-32 4.798n ± 1% 4.815n ± 2% ~ (p=0.560 n=20) Int32N2e9-32 4.840n ± 1% 4.813n ± 1% ~ (p=0.015 n=20) Float32-32 10.51n ± 4% 10.90n ± 2% +3.71% (p=0.007 n=20) Float64-32 20.33n ± 3% 20.32n ± 4% ~ (p=0.566 n=20) ExpFloat64-32 12.59n ± 2% 12.95n ± 3% +2.86% (p=0.002 n=20) NormFloat64-32 7.350n ± 2% 7.570n ± 1% +2.99% (p=0.007 n=20) Perm3-32 39.29n ± 2% 37.80n ± 2% -3.79% (p=0.000 n=20) Perm30-32 219.1n ± 2% 214.0n ± 1% -2.33% (p=0.002 n=20) Perm30ViaShuffle-32 189.8n ± 2% 188.7n ± 2% ~ (p=0.147 n=20) ShuffleOverhead-32 158.9n ± 2% 160.8n ± 1% ~ (p=0.176 n=20) Concurrent-32 3.306n ± 3% 3.288n ± 0% -0.54% (p=0.005 n=20) For #61716. Change-Id: I4c5fe710b310dc075ae21c97d1805bcc20db5050 Reviewed-on: https://go-review.googlesource.com/c/go/+/516275 Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Rob Pike <r@golang.org>	2023-10-30 17:08:47 +00:00
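
The bias described in the ExpFloat64/NormFloat64 message above comes from letting the same random bits choose both the ziggurat slice and the coordinate within it. The sketch below shows the general remedy of drawing the two quantities from disjoint bit fields of one wide value. The function `splitBits` and the field positions (low 7 bits for one of 128 slices, top 32 bits for the coordinate) are assumptions for illustration and are not taken from the change itself.

```go
package main

import "fmt"

// splitBits carves one 64-bit random value into two fields that share no
// bits: the low 7 bits pick one of 128 slices, and the top 32 bits are
// reinterpreted as a signed coordinate.
func splitBits(u uint64) (slice int, coord int32) {
	slice = int(u & 0x7F)
	coord = int32(u >> 32) // wraps into the signed 32-bit range
	return slice, coord
}

func main() {
	s1, c1 := splitBits(0x0000000500000003)
	s2, c2 := splitBits(0x000000050000007F)
	// Changing only the slice bits leaves the coordinate untouched.
	fmt.Println(s1, c1, s2, c2)
}
```

With a 32-bit input there is no room for two such fields, which is why the original code had to reuse low-order bits.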
Russ Cox	ecda959b99	math/rand/v2: optimize Float32, Float64 We realized too late after Go 1 that float64(r.Uint64())/(1<<64) is not a correct implementation: it occasionally rounds to 1. The correct implementation is float64(r.Uint64()&(1<<53-1))/(1<<53) but we couldn't change the implementation for compatibility, so we changed it to retry only in the "round to 1" cases. The change to v2 lets us update the algorithm to the simpler, faster one. Note that this implementation cannot generate 2⁻⁵⁴, nor 2⁻¹⁰⁰, nor any of the other numbers between 0 and 2⁻⁵³. A slower algorithm could shift some of the probability of generating these two boundary values over to the values in between, but that would be much slower and not necessarily be better. In particular, the current implementation has the property that there are uniform gaps between the possible returned floats, which might help stability. Also, the result is often scaled and shifted, like Float64()*X+Y. Multiplying by X>1 would open new gaps, and adding most Y would erase all the distinctions that were introduced. The only changes to benchmarks should be in Float32 and Float64. The other changes remain a cautionary tale. 
goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 4d84a369d1.amd64 │ 2703446c2e.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.348n ± 2% 1.337n ± 1% ~ (p=0.662 n=20) GlobalInt64-32 2.082n ± 2% 2.225n ± 2% +6.87% (p=0.000 n=20) GlobalInt64Parallel-32 0.1036n ± 1% 0.1043n ± 2% ~ (p=0.171 n=20) GlobalUint64-32 2.077n ± 2% 2.058n ± 1% ~ (p=0.560 n=20) GlobalUint64Parallel-32 0.1012n ± 1% 0.1009n ± 1% ~ (p=0.995 n=20) Int64-32 1.750n ± 0% 1.719n ± 2% -1.74% (p=0.000 n=20) Uint64-32 1.707n ± 2% 1.669n ± 1% -2.20% (p=0.000 n=20) GlobalIntN1000-32 3.192n ± 1% 3.321n ± 2% +4.04% (p=0.000 n=20) IntN1000-32 2.462n ± 2% 2.479n ± 1% ~ (p=0.417 n=20) Int64N1000-32 2.470n ± 1% 2.477n ± 1% ~ (p=0.664 n=20) Int64N1e8-32 2.503n ± 2% 2.490n ± 1% ~ (p=0.245 n=20) Int64N1e9-32 2.487n ± 1% 2.458n ± 1% ~ (p=0.032 n=20) Int64N2e9-32 2.487n ± 1% 2.486n ± 2% ~ (p=0.507 n=20) Int64N1e18-32 3.006n ± 2% 3.215n ± 2% +6.94% (p=0.000 n=20) Int64N2e18-32 3.368n ± 1% 3.588n ± 2% +6.55% (p=0.000 n=20) Int64N4e18-32 4.763n ± 1% 4.938n ± 2% +3.69% (p=0.000 n=20) Int32N1000-32 2.403n ± 1% 2.673n ± 2% +11.19% (p=0.000 n=20) Int32N1e8-32 2.405n ± 1% 2.631n ± 2% +9.42% (p=0.000 n=20) Int32N1e9-32 2.402n ± 2% 2.628n ± 2% +9.41% (p=0.000 n=20) Int32N2e9-32 2.384n ± 1% 2.684n ± 2% +12.56% (p=0.000 n=20) Float32-32 2.641n ± 2% 2.240n ± 2% -15.18% (p=0.000 n=20) Float64-32 2.483n ± 1% 2.253n ± 1% -9.26% (p=0.000 n=20) ExpFloat64-32 3.486n ± 2% 3.677n ± 1% +5.49% (p=0.000 n=20) NormFloat64-32 3.648n ± 1% 3.761n ± 1% +3.11% (p=0.000 n=20) Perm3-32 33.04n ± 1% 33.55n ± 2% ~ (p=0.180 n=20) Perm30-32 171.9n ± 1% 173.2n ± 1% ~ (p=0.050 n=20) Perm30ViaShuffle-32 100.3n ± 1% 115.9n ± 1% +15.55% (p=0.000 n=20) ShuffleOverhead-32 102.5n ± 1% 101.9n ± 1% ~ (p=0.266 n=20) Concurrent-32 2.101n ± 0% 2.107n ± 6% ~ (p=0.212 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 4d84a369d1.arm64 │ 2703446c2e.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.261n ± 1% 
2.275n ± 0% ~ (p=0.082 n=20) GlobalInt64-8 2.160n ± 1% 2.154n ± 1% ~ (p=0.490 n=20) GlobalInt64Parallel-8 0.4299n ± 0% 0.4298n ± 0% ~ (p=0.663 n=20) GlobalUint64-8 2.169n ± 1% 2.160n ± 1% ~ (p=0.292 n=20) GlobalUint64Parallel-8 0.4293n ± 1% 0.4286n ± 0% ~ (p=0.155 n=20) Int64-8 2.473n ± 1% 2.491n ± 1% ~ (p=0.317 n=20) Uint64-8 2.453n ± 1% 2.458n ± 0% ~ (p=0.941 n=20) GlobalIntN1000-8 2.814n ± 2% 2.814n ± 2% ~ (p=0.972 n=20) IntN1000-8 2.933n ± 2% 2.933n ± 0% ~ (p=0.287 n=20) Int64N1000-8 2.934n ± 2% 2.962n ± 1% ~ (p=0.062 n=20) Int64N1e8-8 2.935n ± 2% 2.960n ± 1% ~ (p=0.183 n=20) Int64N1e9-8 2.934n ± 2% 2.935n ± 2% ~ (p=0.367 n=20) Int64N2e9-8 2.935n ± 2% 2.934n ± 0% ~ (p=0.455 n=20) Int64N1e18-8 3.778n ± 1% 3.777n ± 1% ~ (p=0.995 n=20) Int64N2e18-8 4.359n ± 1% 4.359n ± 1% ~ (p=0.122 n=20) Int64N4e18-8 6.546n ± 1% 6.536n ± 1% ~ (p=0.920 n=20) Int32N1000-8 2.940n ± 2% 2.937n ± 0% ~ (p=0.149 n=20) Int32N1e8-8 2.937n ± 2% 2.937n ± 1% ~ (p=0.620 n=20) Int32N1e9-8 2.938n ± 0% 2.936n ± 0% ~ (p=0.046 n=20) Int32N2e9-8 2.938n ± 2% 2.938n ± 2% ~ (p=0.455 n=20) Float32-8 3.486n ± 0% 3.441n ± 0% -1.28% (p=0.000 n=20) Float64-8 3.480n ± 0% 3.441n ± 0% -1.13% (p=0.000 n=20) ExpFloat64-8 4.533n ± 0% 4.486n ± 0% -1.03% (p=0.000 n=20) NormFloat64-8 4.764n ± 0% 4.721n ± 0% -0.90% (p=0.000 n=20) Perm3-8 26.66n ± 0% 26.65n ± 0% ~ (p=0.019 n=20) Perm30-8 143.4n ± 0% 143.2n ± 0% -0.17% (p=0.000 n=20) Perm30ViaShuffle-8 142.9n ± 0% 143.0n ± 0% ~ (p=0.522 n=20) ShuffleOverhead-8 120.7n ± 0% 120.6n ± 1% ~ (p=0.488 n=20) Concurrent-8 2.360n ± 2% 2.399n ± 5% ~ (p=0.062 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 4d84a369d1.386 │ 2703446c2e.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.101n ± 2% 2.072n ± 2% ~ (p=0.273 n=20) GlobalInt64-32 3.518n ± 2% 3.546n ± 27% +0.78% (p=0.007 n=20) GlobalInt64Parallel-32 0.3206n ± 0% 0.3211n ± 0% ~ (p=0.386 n=20) GlobalUint64-32 3.538n ± 1% 3.522n ± 2% ~ (p=0.331 n=20) GlobalUint64Parallel-32 0.3231n ± 
0% 0.3172n ± 0% -1.84% (p=0.000 n=20) Int64-32 2.554n ± 2% 2.520n ± 2% ~ (p=0.465 n=20) Uint64-32 2.575n ± 2% 2.581n ± 1% ~ (p=0.213 n=20) GlobalIntN1000-32 6.292n ± 1% 6.171n ± 1% ~ (p=0.015 n=20) IntN1000-32 4.735n ± 1% 4.752n ± 2% ~ (p=0.635 n=20) Int64N1000-32 5.489n ± 2% 5.429n ± 1% ~ (p=0.324 n=20) Int64N1e8-32 5.528n ± 2% 5.469n ± 2% ~ (p=0.013 n=20) Int64N1e9-32 5.438n ± 2% 5.489n ± 2% ~ (p=0.984 n=20) Int64N2e9-32 5.474n ± 1% 5.492n ± 2% ~ (p=0.616 n=20) Int64N1e18-32 9.053n ± 1% 8.927n ± 1% ~ (p=0.037 n=20) Int64N2e18-32 9.685n ± 2% 9.622n ± 1% ~ (p=0.449 n=20) Int64N4e18-32 12.18n ± 1% 12.03n ± 1% ~ (p=0.013 n=20) Int32N1000-32 4.862n ± 1% 4.817n ± 1% -0.94% (p=0.002 n=20) Int32N1e8-32 4.758n ± 2% 4.801n ± 1% ~ (p=0.597 n=20) Int32N1e9-32 4.772n ± 1% 4.798n ± 1% ~ (p=0.774 n=20) Int32N2e9-32 4.847n ± 0% 4.840n ± 1% ~ (p=0.867 n=20) Float32-32 22.18n ± 4% 10.51n ± 4% -52.61% (p=0.000 n=20) Float64-32 21.21n ± 3% 20.33n ± 3% -4.17% (p=0.000 n=20) ExpFloat64-32 12.39n ± 2% 12.59n ± 2% ~ (p=0.139 n=20) NormFloat64-32 7.422n ± 1% 7.350n ± 2% ~ (p=0.208 n=20) Perm3-32 38.00n ± 2% 39.29n ± 2% +3.38% (p=0.000 n=20) Perm30-32 212.7n ± 1% 219.1n ± 2% +3.03% (p=0.001 n=20) Perm30ViaShuffle-32 187.5n ± 2% 189.8n ± 2% ~ (p=0.457 n=20) ShuffleOverhead-32 159.7n ± 1% 158.9n ± 2% ~ (p=0.920 n=20) Concurrent-32 3.470n ± 0% 3.306n ± 3% -4.71% (p=0.000 n=20) For #61716. Change-Id: I1933f1f9efd7e6e832d83e7fa5d84398f67d41f5 Reviewed-on: https://go-review.googlesource.com/c/go/+/502503 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2023-10-30 17:08:40 +00:00
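
The rounding problem described in the Float32/Float64 message above can be checked directly. The sketch below takes the two expressions quoted in that message and wraps them in hypothetical helpers, `naiveFloat64` and `maskedFloat64`, which accept the raw 64-bit value as an argument so the worst-case input can be supplied by hand.

```go
package main

import (
	"fmt"
	"math"
)

// naiveFloat64 scales all 64 bits. A float64 has only 53 bits of
// precision, so values near the top of the range round up to 2^64 during
// conversion and the quotient becomes exactly 1.
func naiveFloat64(x uint64) float64 {
	return float64(x) / (1 << 64)
}

// maskedFloat64 keeps the low 53 bits, which convert exactly, so the
// result is always a multiple of 2^-53 in the half-open range [0, 1).
func maskedFloat64(x uint64) float64 {
	return float64(x&(1<<53-1)) / (1 << 53)
}

func main() {
	x := uint64(math.MaxUint64)
	fmt.Println(naiveFloat64(x) == 1)                   // true: rounds to 1
	fmt.Println(maskedFloat64(x) < 1)                   // true: largest value is 1 - 2^-53
	fmt.Println(maskedFloat64(1) == math.Ldexp(1, -53)) // true: smallest nonzero value is 2^-53
}
```

The last line also illustrates the message's remark about uniform gaps: the smallest nonzero result is 2⁻⁵³, and every other result is an integer multiple of it.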
Russ Cox	c266587846	math/rand/v2: add, optimize N, UintN, Uint32N, Uint64N Now that we can break the value stream, we can take advantage of better algorithms that have been suggested since the original code was written. Also optimizes IntN, Int32N, Int64N, Perm (indirectly). All the N variants (IntN, Int32N, Int64N, UintN, N, etc) now return the same values given a Source and parameter n, so that for example uint(r.IntN(10)) and r.UintN(10) and r.N(uint(10)) are completely interchangeable. Int64N4e18 gets slower but that is a near worst case for the algorithm and is extremely unlikely in practice. 32-bit Int32N variants got slower too, by 15-30%, in exchange for speeding up everything on 64-bit systems and consistency across the N functions. Also rename previously missed benchmark GlobalInt63Parallel to GlobalInt64Parallel. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 11ad9fdddc.amd64 │ 4d84a369d1.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.335n ± 1% 1.348n ± 2% ~ (p=0.335 n=20) GlobalInt64-32 2.046n ± 1% 2.082n ± 2% ~ (p=0.310 n=20) GlobalInt63Parallel-32 0.1037n ± 1% GlobalInt64Parallel-32 0.1036n ± 1% GlobalUint64-32 2.075n ± 0% 2.077n ± 2% ~ (p=0.228 n=20) GlobalUint64Parallel-32 0.1013n ± 1% 0.1012n ± 1% ~ (p=0.878 n=20) Int64-32 1.726n ± 2% 1.750n ± 0% +1.39% (p=0.000 n=20) Uint64-32 1.673n ± 1% 1.707n ± 2% +2.03% (p=0.002 n=20) GlobalIntN1000-32 3.895n ± 2% 3.192n ± 1% -18.05% (p=0.000 n=20) IntN1000-32 3.403n ± 1% 2.462n ± 2% -27.65% (p=0.000 n=20) Int64N1000-32 3.053n ± 2% 2.470n ± 1% -19.11% (p=0.000 n=20) Int64N1e8-32 2.718n ± 1% 2.503n ± 2% -7.91% (p=0.000 n=20) Int64N1e9-32 2.712n ± 1% 2.487n ± 1% -8.31% (p=0.000 n=20) Int64N2e9-32 2.690n ± 1% 2.487n ± 1% -7.57% (p=0.000 n=20) Int64N1e18-32 3.084n ± 2% 3.006n ± 2% -2.53% (p=0.000 n=20) Int64N2e18-32 4.026n ± 1% 3.368n ± 1% -16.33% (p=0.000 n=20) Int64N4e18-32 4.049n ± 2% 4.763n ± 1% +17.62% (p=0.000 n=20) Int32N1000-32 2.730n ± 0% 2.403n ± 1% -11.94% 
(p=0.000 n=20) Int32N1e8-32 2.916n ± 2% 2.405n ± 1% -17.53% (p=0.000 n=20) Int32N1e9-32 3.375n ± 1% 2.402n ± 2% -28.83% (p=0.000 n=20) Int32N2e9-32 3.292n ± 1% 2.384n ± 1% -27.58% (p=0.000 n=20) Float32-32 2.673n ± 1% 2.641n ± 2% ~ (p=0.147 n=20) Float64-32 2.485n ± 1% 2.483n ± 1% ~ (p=0.804 n=20) ExpFloat64-32 3.577n ± 2% 3.486n ± 2% -2.57% (p=0.000 n=20) NormFloat64-32 3.797n ± 2% 3.648n ± 1% -3.92% (p=0.000 n=20) Perm3-32 35.79n ± 2% 33.04n ± 1% -7.68% (p=0.000 n=20) Perm30-32 205.1n ± 1% 171.9n ± 1% -16.14% (p=0.000 n=20) Perm30ViaShuffle-32 111.2n ± 2% 100.3n ± 1% -9.76% (p=0.000 n=20) ShuffleOverhead-32 100.5n ± 2% 102.5n ± 1% +1.99% (p=0.007 n=20) Concurrent-32 2.188n ± 5% 2.101n ± 0% ~ (p=0.013 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 11ad9fdddc.arm64 │ 4d84a369d1.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.272n ± 1% 2.261n ± 1% ~ (p=0.172 n=20) GlobalInt64-8 2.155n ± 1% 2.160n ± 1% ~ (p=0.482 n=20) GlobalInt63Parallel-8 0.4352n ± 0% GlobalInt64Parallel-8 0.4299n ± 0% GlobalUint64-8 2.173n ± 1% 2.169n ± 1% ~ (p=0.262 n=20) GlobalUint64Parallel-8 0.4340n ± 0% 0.4293n ± 1% -1.08% (p=0.000 n=20) Int64-8 2.544n ± 1% 2.473n ± 1% -2.83% (p=0.000 n=20) Uint64-8 2.552n ± 1% 2.453n ± 1% -3.90% (p=0.000 n=20) GlobalIntN1000-8 3.856n ± 0% 2.814n ± 2% -27.02% (p=0.000 n=20) IntN1000-8 3.820n ± 0% 2.933n ± 2% -23.22% (p=0.000 n=20) Int64N1000-8 3.219n ± 2% 2.934n ± 2% -8.85% (p=0.000 n=20) Int64N1e8-8 3.221n ± 2% 2.935n ± 2% -8.91% (p=0.000 n=20) Int64N1e9-8 3.276n ± 2% 2.934n ± 2% -10.44% (p=0.000 n=20) Int64N2e9-8 3.217n ± 0% 2.935n ± 2% -8.78% (p=0.000 n=20) Int64N1e18-8 3.502n ± 2% 3.778n ± 1% +7.91% (p=0.000 n=20) Int64N2e18-8 4.968n ± 1% 4.359n ± 1% -12.26% (p=0.000 n=20) Int64N4e18-8 4.963n ± 0% 6.546n ± 1% +31.92% (p=0.000 n=20) Int32N1000-8 3.189n ± 1% 2.940n ± 2% -7.81% (p=0.000 n=20) Int32N1e8-8 3.514n ± 1% 2.937n ± 2% -16.41% (p=0.000 n=20) Int32N1e9-8 4.133n ± 0% 2.938n ± 0% -28.91% (p=0.000 n=20) Int32N2e9-8 4.137n ± 0% 
2.938n ± 2% -28.97% (p=0.000 n=20) Float32-8 3.468n ± 1% 3.486n ± 0% +0.52% (p=0.000 n=20) Float64-8 3.478n ± 0% 3.480n ± 0% ~ (p=0.063 n=20) ExpFloat64-8 4.563n ± 0% 4.533n ± 0% -0.67% (p=0.000 n=20) NormFloat64-8 4.768n ± 0% 4.764n ± 0% -0.07% (p=0.001 n=20) Perm3-8 28.94n ± 0% 26.66n ± 0% -7.88% (p=0.000 n=20) Perm30-8 175.9n ± 0% 143.4n ± 0% -18.50% (p=0.000 n=20) Perm30ViaShuffle-8 152.6n ± 1% 142.9n ± 0% -6.29% (p=0.000 n=20) ShuffleOverhead-8 119.6n ± 1% 120.7n ± 0% +0.96% (p=0.000 n=20) Concurrent-8 2.452n ± 3% 2.360n ± 2% -3.73% (p=0.007 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 11ad9fdddc.386 │ 4d84a369d1.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.091n ± 1% 2.101n ± 2% ~ (p=0.672 n=20) GlobalInt64-32 3.514n ± 2% 3.518n ± 2% ~ (p=0.723 n=20) GlobalInt63Parallel-32 0.3197n ± 0% GlobalInt64Parallel-32 0.3206n ± 0% GlobalUint64-32 3.542n ± 1% 3.538n ± 1% ~ (p=0.304 n=20) GlobalUint64Parallel-32 0.3218n ± 0% 0.3231n ± 0% ~ (p=0.071 n=20) Int64-32 2.552n ± 2% 2.554n ± 2% ~ (p=0.693 n=20) Uint64-32 2.566n ± 1% 2.575n ± 2% ~ (p=0.606 n=20) GlobalIntN1000-32 5.965n ± 2% 6.292n ± 1% +5.46% (p=0.000 n=20) IntN1000-32 4.652n ± 1% 4.735n ± 1% +1.77% (p=0.000 n=20) Int64N1000-32 14.485n ± 1% 5.489n ± 2% -62.11% (p=0.000 n=20) Int64N1e8-32 14.675n ± 1% 5.528n ± 2% -62.33% (p=0.000 n=20) Int64N1e9-32 16.805n ± 2% 5.438n ± 2% -67.64% (p=0.000 n=20) Int64N2e9-32 14.515n ± 1% 5.474n ± 1% -62.28% (p=0.000 n=20) Int64N1e18-32 16.165n ± 1% 9.053n ± 1% -44.00% (p=0.000 n=20) Int64N2e18-32 17.945n ± 2% 9.685n ± 2% -46.03% (p=0.000 n=20) Int64N4e18-32 18.35n ± 2% 12.18n ± 1% -33.62% (p=0.000 n=20) Int32N1000-32 3.608n ± 1% 4.862n ± 1% +34.77% (p=0.000 n=20) Int32N1e8-32 3.767n ± 1% 4.758n ± 2% +26.31% (p=0.000 n=20) Int32N1e9-32 4.130n ± 2% 4.772n ± 1% +15.54% (p=0.000 n=20) Int32N2e9-32 4.206n ± 1% 4.847n ± 0% +15.24% (p=0.000 n=20) Float32-32 22.18n ± 4% 22.18n ± 4% ~ (p=0.195 n=20) Float64-32 20.75n ± 4% 21.21n ± 3% ~ 
(p=0.394 n=20) ExpFloat64-32 12.58n ± 3% 12.39n ± 2% ~ (p=0.032 n=20) NormFloat64-32 7.920n ± 3% 7.422n ± 1% -6.29% (p=0.000 n=20) Perm3-32 40.27n ± 1% 38.00n ± 2% -5.65% (p=0.000 n=20) Perm30-32 213.2n ± 2% 212.7n ± 1% ~ (p=0.995 n=20) Perm30ViaShuffle-32 164.2n ± 2% 187.5n ± 2% +14.22% (p=0.000 n=20) ShuffleOverhead-32 134.7n ± 2% 159.7n ± 1% +18.52% (p=0.000 n=20) Concurrent-32 3.301n ± 2% 3.470n ± 0% +5.10% (p=0.000 n=20) For #61716. Change-Id: Id1481b04202883cd0b23e21bb58d1bca4e482bd3 Reviewed-on: https://go-review.googlesource.com/c/go/+/502500 Reviewed-by: Rob Pike <r@golang.org> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2023-10-30 17:08:37 +00:00
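The interchangeability claim in the commit above can be checked with a small sketch. It assumes the released math/rand/v2 API (`rand.New`, `rand.NewPCG`, `IntN`, `UintN`); the seeds and the helper name `sameStream` are made up for illustration and are not part of the commit.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// sameStream (hypothetical helper) reports whether IntN and UintN return the
// same values when driven by two identically seeded generators.
func sameStream(seed1, seed2 uint64, draws int) bool {
	a := rand.New(rand.NewPCG(seed1, seed2))
	b := rand.New(rand.NewPCG(seed1, seed2))
	for i := 0; i < draws; i++ {
		if uint(a.IntN(10)) != b.UintN(10) {
			return false
		}
	}
	return true
}

func main() {
	// Arbitrary seeds; any pair should behave the same way.
	fmt.Println(sameStream(1, 2, 1000))
}
```

If the guarantee described in the commit message holds, the two streams never diverge.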
Russ Cox	c7dddb02d3	math/rand/v2: change Source to use uint64 This should make Uint64-using functions faster and leave other things alone. It is a mystery why so much got faster. A good cautionary tale not to read too much into minor jitter in the benchmarks. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 220860f76f.amd64 │ 11ad9fdddc.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.555n ± 1% 1.335n ± 1% -14.15% (p=0.000 n=20) GlobalInt64-32 2.071n ± 1% 2.046n ± 1% ~ (p=0.016 n=20) GlobalInt63Parallel-32 0.1023n ± 1% 0.1037n ± 1% +1.37% (p=0.002 n=20) GlobalUint64-32 5.193n ± 1% 2.075n ± 0% -60.06% (p=0.000 n=20) GlobalUint64Parallel-32 0.2341n ± 0% 0.1013n ± 1% -56.74% (p=0.000 n=20) Int64-32 2.056n ± 2% 1.726n ± 2% -16.10% (p=0.000 n=20) Uint64-32 2.077n ± 2% 1.673n ± 1% -19.46% (p=0.000 n=20) GlobalIntN1000-32 4.077n ± 2% 3.895n ± 2% -4.45% (p=0.000 n=20) IntN1000-32 3.476n ± 2% 3.403n ± 1% -2.10% (p=0.000 n=20) Int64N1000-32 3.059n ± 1% 3.053n ± 2% ~ (p=0.131 n=20) Int64N1e8-32 2.942n ± 1% 2.718n ± 1% -7.60% (p=0.000 n=20) Int64N1e9-32 2.932n ± 1% 2.712n ± 1% -7.50% (p=0.000 n=20) Int64N2e9-32 2.925n ± 1% 2.690n ± 1% -8.03% (p=0.000 n=20) Int64N1e18-32 3.116n ± 1% 3.084n ± 2% ~ (p=0.425 n=20) Int64N2e18-32 4.067n ± 1% 4.026n ± 1% -1.02% (p=0.007 n=20) Int64N4e18-32 4.054n ± 1% 4.049n ± 2% ~ (p=0.204 n=20) Int32N1000-32 2.951n ± 1% 2.730n ± 0% -7.49% (p=0.000 n=20) Int32N1e8-32 3.102n ± 1% 2.916n ± 2% -6.03% (p=0.000 n=20) Int32N1e9-32 3.535n ± 1% 3.375n ± 1% -4.54% (p=0.000 n=20) Int32N2e9-32 3.514n ± 1% 3.292n ± 1% -6.30% (p=0.000 n=20) Float32-32 2.760n ± 1% 2.673n ± 1% -3.13% (p=0.000 n=20) Float64-32 2.284n ± 1% 2.485n ± 1% +8.80% (p=0.000 n=20) ExpFloat64-32 3.757n ± 1% 3.577n ± 2% -4.78% (p=0.000 n=20) NormFloat64-32 3.837n ± 1% 3.797n ± 2% ~ (p=0.204 n=20) Perm3-32 35.23n ± 2% 35.79n ± 2% ~ (p=0.298 n=20) Perm30-32 208.8n ± 1% 205.1n ± 1% -1.82% (p=0.000 n=20) Perm30ViaShuffle-32 111.7n ± 1% 111.2n ± 2% ~ 
(p=0.273 n=20) ShuffleOverhead-32 101.1n ± 1% 100.5n ± 2% ~ (p=0.878 n=20) Concurrent-32 2.108n ± 7% 2.188n ± 5% ~ (p=0.417 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 │ 220860f76f.arm64 │ 11ad9fdddc.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.316n ± 1% 2.272n ± 1% -1.86% (p=0.000 n=20) GlobalInt64-8 2.183n ± 1% 2.155n ± 1% ~ (p=0.122 n=20) GlobalInt63Parallel-8 0.4331n ± 0% 0.4352n ± 0% +0.48% (p=0.000 n=20) GlobalUint64-8 4.377n ± 2% 2.173n ± 1% -50.35% (p=0.000 n=20) GlobalUint64Parallel-8 0.9237n ± 0% 0.4340n ± 0% -53.02% (p=0.000 n=20) Int64-8 2.538n ± 1% 2.544n ± 1% ~ (p=0.189 n=20) Uint64-8 2.604n ± 1% 2.552n ± 1% -1.98% (p=0.000 n=20) GlobalIntN1000-8 3.857n ± 2% 3.856n ± 0% ~ (p=0.051 n=20) IntN1000-8 3.822n ± 2% 3.820n ± 0% -0.05% (p=0.001 n=20) Int64N1000-8 3.318n ± 0% 3.219n ± 2% -2.98% (p=0.000 n=20) Int64N1e8-8 3.349n ± 1% 3.221n ± 2% -3.79% (p=0.000 n=20) Int64N1e9-8 3.317n ± 2% 3.276n ± 2% -1.24% (p=0.001 n=20) Int64N2e9-8 3.317n ± 2% 3.217n ± 0% -3.01% (p=0.000 n=20) Int64N1e18-8 3.542n ± 1% 3.502n ± 2% -1.16% (p=0.001 n=20) Int64N2e18-8 5.087n ± 0% 4.968n ± 1% -2.33% (p=0.000 n=20) Int64N4e18-8 5.084n ± 0% 4.963n ± 0% -2.39% (p=0.000 n=20) Int32N1000-8 3.208n ± 2% 3.189n ± 1% -0.58% (p=0.001 n=20) Int32N1e8-8 3.610n ± 1% 3.514n ± 1% -2.67% (p=0.000 n=20) Int32N1e9-8 4.235n ± 0% 4.133n ± 0% -2.40% (p=0.000 n=20) Int32N2e9-8 4.229n ± 1% 4.137n ± 0% -2.19% (p=0.000 n=20) Float32-8 3.468n ± 0% 3.468n ± 1% ~ (p=0.350 n=20) Float64-8 3.447n ± 0% 3.478n ± 0% +0.90% (p=0.000 n=20) ExpFloat64-8 4.567n ± 0% 4.563n ± 0% -0.10% (p=0.002 n=20) NormFloat64-8 4.821n ± 0% 4.768n ± 0% -1.09% (p=0.000 n=20) Perm3-8 28.89n ± 0% 28.94n ± 0% +0.17% (p=0.000 n=20) Perm30-8 175.7n ± 0% 175.9n ± 0% +0.14% (p=0.000 n=20) Perm30ViaShuffle-8 153.5n ± 0% 152.6n ± 1% ~ (p=0.010 n=20) ShuffleOverhead-8 119.8n ± 1% 119.6n ± 1% ~ (p=0.147 n=20) Concurrent-8 2.433n ± 3% 2.452n ± 3% ~ (p=0.616 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 
16-Core Processor │ 220860f76f.386 │ 11ad9fdddc.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.370n ± 1% 2.091n ± 1% -11.75% (p=0.000 n=20) GlobalInt64-32 3.569n ± 1% 3.514n ± 2% -1.56% (p=0.000 n=20) GlobalInt63Parallel-32 0.3221n ± 1% 0.3197n ± 0% -0.76% (p=0.000 n=20) GlobalUint64-32 8.797n ± 10% 3.542n ± 1% -59.74% (p=0.000 n=20) GlobalUint64Parallel-32 0.6351n ± 0% 0.3218n ± 0% -49.33% (p=0.000 n=20) Int64-32 2.612n ± 2% 2.552n ± 2% -2.30% (p=0.000 n=20) Uint64-32 3.350n ± 1% 2.566n ± 1% -23.42% (p=0.000 n=20) GlobalIntN1000-32 5.892n ± 1% 5.965n ± 2% ~ (p=0.082 n=20) IntN1000-32 4.546n ± 1% 4.652n ± 1% +2.33% (p=0.000 n=20) Int64N1000-32 14.59n ± 1% 14.48n ± 1% ~ (p=0.652 n=20) Int64N1e8-32 14.76n ± 2% 14.67n ± 1% ~ (p=0.836 n=20) Int64N1e9-32 16.57n ± 1% 16.80n ± 2% ~ (p=0.016 n=20) Int64N2e9-32 14.54n ± 1% 14.52n ± 1% ~ (p=0.533 n=20) Int64N1e18-32 16.14n ± 1% 16.16n ± 1% ~ (p=0.606 n=20) Int64N2e18-32 18.10n ± 1% 17.95n ± 2% ~ (p=0.062 n=20) Int64N4e18-32 18.65n ± 1% 18.35n ± 2% -1.61% (p=0.010 n=20) Int32N1000-32 3.560n ± 1% 3.608n ± 1% +1.33% (p=0.001 n=20) Int32N1e8-32 3.770n ± 2% 3.767n ± 1% ~ (p=0.155 n=20) Int32N1e9-32 4.098n ± 0% 4.130n ± 2% ~ (p=0.016 n=20) Int32N2e9-32 4.179n ± 1% 4.206n ± 1% ~ (p=0.011 n=20) Float32-32 21.18n ± 4% 22.18n ± 4% +4.70% (p=0.003 n=20) Float64-32 20.60n ± 2% 20.75n ± 4% +0.73% (p=0.000 n=20) ExpFloat64-32 13.07n ± 0% 12.58n ± 3% -3.82% (p=0.000 n=20) NormFloat64-32 7.738n ± 2% 7.920n ± 3% ~ (p=0.066 n=20) Perm3-32 36.73n ± 1% 40.27n ± 1% +9.65% (p=0.000 n=20) Perm30-32 211.9n ± 1% 213.2n ± 2% ~ (p=0.262 n=20) Perm30ViaShuffle-32 165.2n ± 1% 164.2n ± 2% ~ (p=0.029 n=20) ShuffleOverhead-32 133.9n ± 1% 134.7n ± 2% ~ (p=0.551 n=20) Concurrent-32 3.287n ± 2% 3.301n ± 2% ~ (p=0.330 n=20) For #61716. 
Change-Id: I8d2f73f87dd3603a0c2ff069988938e0957b6904 Reviewed-on: https://go-review.googlesource.com/c/go/+/502499 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Rob Pike <r@golang.org>	2023-10-30 17:08:34 +00:00
Russ Cox	1f4db9dbd6	math/rand/v2: update benchmarks Change the benchmarks to use the result of the calls, as I found that in certain cases inlining resulted in discarding part of the computation in the benchmark loop. Add various benchmarks that will be relevant in future CLs. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 220860f76f.amd64 │ │ sec/op │ SourceUint64-32 1.555n ± 1% GlobalInt64-32 2.071n ± 1% GlobalInt63Parallel-32 0.1023n ± 1% GlobalUint64-32 5.193n ± 1% GlobalUint64Parallel-32 0.2341n ± 0% Int64-32 2.056n ± 2% Uint64-32 2.077n ± 2% GlobalIntN1000-32 4.077n ± 2% IntN1000-32 3.476n ± 2% Int64N1000-32 3.059n ± 1% Int64N1e8-32 2.942n ± 1% Int64N1e9-32 2.932n ± 1% Int64N2e9-32 2.925n ± 1% Int64N1e18-32 3.116n ± 1% Int64N2e18-32 4.067n ± 1% Int64N4e18-32 4.054n ± 1% Int32N1000-32 2.951n ± 1% Int32N1e8-32 3.102n ± 1% Int32N1e9-32 3.535n ± 1% Int32N2e9-32 3.514n ± 1% Float32-32 2.760n ± 1% Float64-32 2.284n ± 1% ExpFloat64-32 3.757n ± 1% NormFloat64-32 3.837n ± 1% Perm3-32 35.23n ± 2% Perm30-32 208.8n ± 1% Perm30ViaShuffle-32 111.7n ± 1% ShuffleOverhead-32 101.1n ± 1% Concurrent-32 2.108n ± 7% goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 220860f76f.arm64 │ │ sec/op │ SourceUint64-8 2.316n ± 1% GlobalInt64-8 2.183n ± 1% GlobalInt63Parallel-8 0.4331n ± 0% GlobalUint64-8 4.377n ± 2% GlobalUint64Parallel-8 0.9237n ± 0% Int64-8 2.538n ± 1% Uint64-8 2.604n ± 1% GlobalIntN1000-8 3.857n ± 2% IntN1000-8 3.822n ± 2% Int64N1000-8 3.318n ± 0% Int64N1e8-8 3.349n ± 1% Int64N1e9-8 3.317n ± 2% Int64N2e9-8 3.317n ± 2% Int64N1e18-8 3.542n ± 1% Int64N2e18-8 5.087n ± 0% Int64N4e18-8 5.084n ± 0% Int32N1000-8 3.208n ± 2% Int32N1e8-8 3.610n ± 1% Int32N1e9-8 4.235n ± 0% Int32N2e9-8 4.229n ± 1% Float32-8 3.468n ± 0% Float64-8 3.447n ± 0% ExpFloat64-8 4.567n ± 0% NormFloat64-8 4.821n ± 0% Perm3-8 28.89n ± 0% Perm30-8 175.7n ± 0% Perm30ViaShuffle-8 153.5n ± 0% ShuffleOverhead-8 119.8n ± 1% Concurrent-8 2.433n ± 3% goos: linux 
goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 220860f76f.386 │ │ sec/op │ SourceUint64-32 2.370n ± 1% GlobalInt64-32 3.569n ± 1% GlobalInt63Parallel-32 0.3221n ± 1% GlobalUint64-32 8.797n ± 10% GlobalUint64Parallel-32 0.6351n ± 0% Int64-32 2.612n ± 2% Uint64-32 3.350n ± 1% GlobalIntN1000-32 5.892n ± 1% IntN1000-32 4.546n ± 1% Int64N1000-32 14.59n ± 1% Int64N1e8-32 14.76n ± 2% Int64N1e9-32 16.57n ± 1% Int64N2e9-32 14.54n ± 1% Int64N1e18-32 16.14n ± 1% Int64N2e18-32 18.10n ± 1% Int64N4e18-32 18.65n ± 1% Int32N1000-32 3.560n ± 1% Int32N1e8-32 3.770n ± 2% Int32N1e9-32 4.098n ± 0% Int32N2e9-32 4.179n ± 1% Float32-32 21.18n ± 4% Float64-32 20.60n ± 2% ExpFloat64-32 13.07n ± 0% NormFloat64-32 7.738n ± 2% Perm3-32 36.73n ± 1% Perm30-32 211.9n ± 1% Perm30ViaShuffle-32 165.2n ± 1% ShuffleOverhead-32 133.9n ± 1% Concurrent-32 3.287n ± 2% For #61716. Change-Id: I2f0938eae4b7bf736a8cd899a99783e731bf2179 Reviewed-on: https://go-review.googlesource.com/c/go/+/502496 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Rob Pike <r@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2023-10-30 14:32:20 +00:00
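A minimal sketch of the "use the result" pattern this commit describes: the value computed in the loop is stored in a package-level variable so the compiler cannot discard the work after inlining. The names `sink` and `benchUint64` are illustrative, not taken from the CL, and `testing.Benchmark` is used only so the example runs outside `go test`.

```go
package main

import (
	"fmt"
	"math/rand/v2"
	"testing"
)

// sink is a package-level variable (hypothetical name). Storing into it
// forces the benchmarked computation to be kept.
var sink uint64

// benchUint64 accumulates every result and publishes the total, so no call
// in the loop can be optimized away.
func benchUint64(b *testing.B) {
	var t uint64
	for i := 0; i < b.N; i++ {
		t += rand.Uint64()
	}
	sink = t
}

func main() {
	res := testing.Benchmark(benchUint64)
	fmt.Println(res.N, res.NsPerOp())
}
```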
Russ Cox	1cc5b34d28	math/rand/v2: remove Rand.Seed Removing Rand.Seed lets us remove lockedSource as well, along with the ambiguity in globalRand about which source to use. For #61716. Change-Id: Ibe150520dd1e7dd87165eacaebe9f0c2daeaedfd Reviewed-on: https://go-review.googlesource.com/c/go/+/502498 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Rob Pike <r@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org>	2023-10-30 14:31:46 +00:00
Russ Cox	48bd1fc93b	math/rand/v2: clean up regression test Add more test cases. Replace -printgolden with -update, which rewrites the files for us. For #61716. Change-Id: I7c4c900ee896042429135a21971a56ebe16b6a66 Reviewed-on: https://go-review.googlesource.com/c/go/+/516858 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2023-10-30 14:30:24 +00:00
Russ Cox	d6c1ef52ad	math/rand/v2: remove Read In math/rand, Read is deprecated. Remove in v2. People should use crypto/rand if they need long strings. For #61716. Change-Id: Ib254b7e1844616e96db60a3a7abb572b0dcb1583 Reviewed-on: https://go-review.googlesource.com/c/go/+/502497 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2023-10-30 14:30:14 +00:00
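For callers who relied on `Read`, the commit points to crypto/rand. A short sketch of that replacement follows; the helper name `randomHex` is an assumption for illustration.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomHex (hypothetical helper) fills n bytes from crypto/rand and returns
// them hex-encoded.
func randomHex(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	s, err := randomHex(16)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```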
Russ Cox	d42750b17c	math/rand/v2: rename various functions Int31 -> Int32 Int31n -> Int32N Int63 -> Int64 Int63n -> Int64N Intn -> IntN The 31 and 63 are pedantic and confusing: the functions should be named for the type they return, same as all the others. The lower-case n is inconsistent with Go's usual CamelCase and especially problematic because we plan to add 'func N'. Capitalize the n. For #61716. Change-Id: Idb1a005a82f353677450d47fb612ade7a41fde69 Reviewed-on: https://go-review.googlesource.com/c/go/+/516857 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Robert Griesemer <gri@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2023-10-30 14:29:37 +00:00
Russ Cox	59f0ab4036	math/rand/v2: start of new API This is the beginning of the math/rand/v2 package from proposal #61716. Start by copying old API. This CL copies math/rand/* to math/rand/v2 and updates references to math/rand to add v2 throughout. Later CLs will make the v2 changes. For #61716. Change-Id: I1624ccffae3dfa442d4ba2461942decbd076e11b Reviewed-on: https://go-review.googlesource.com/c/go/+/502495 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Rob Pike <r@golang.org>	2023-10-30 14:29:30 +00:00
Dmitri Shuralyov	bf97e724b5	all: drop old +build lines Running 'go fix' on the cmd+std packages handled much of this change. Also update code generators to use only the new go:build lines, not the old +build ones. For #41184. For #60268. Change-Id: If35532abe3012e7357b02c79d5992ff5ac37ca23 Cq-Include-Trybots: luci.golang.try:gotip-linux-386-longtest,gotip-linux-amd64-longtest,gotip-windows-amd64-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/536237 Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2023-10-19 23:33:27 +00:00
cui fliter	d57303e65f	math: add available godoc link Change-Id: I4a6c2ef6fd21355952ab7d8eaad883646a95d364 Reviewed-on: https://go-review.googlesource.com/c/go/+/535087 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Than McIntosh <thanm@google.com>	2023-10-19 11:59:09 +00:00
Oleksandr Redko	da8f406f06	all: simplify bool conditions Change-Id: Id2079f7012392dea8dfe2386bb9fb1ea3f487a4a Reviewed-on: https://go-review.googlesource.com/c/go/+/526015 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: qiulaidongfeng <2645477756@qq.com>	2023-09-20 18:06:13 +00:00
Dmitri Shuralyov	0dfb22ed70	all: use ^TestName$ regular pattern for invoking a single test Use ^ and $ in the -run flag regular expression value when the intention is to invoke a single named test. This removes the reliance on there not being another similarly named test to achieve the intended result. In particular, package syscall has tests named TestUnshareMountNameSpace and TestUnshareMountNameSpaceChroot that both trigger themselves setting GO_WANT_HELPER_PROCESS=1 to run alternate code in a helper process. As a consequence of overlap in their test names, the former was inadvertently triggering one too many helpers. Spotted while reviewing CL 525196. Apply the same change in other places to make it easier for code readers to see that said tests aren't running extraneous tests. The unlikely cases of -run=TestSomething intentionally being used to run all tests that have the TestSomething substring in the name can be better written as -run=^.*TestSomething.*$ or with a comment so it is clear it wasn't an oversight. Change-Id: Iba208aba3998acdbf8c6708e5d23ab88938bfc1e Reviewed-on: https://go-review.googlesource.com/c/go/+/524948 Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Kirill Kolyshkin <kolyshkin@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>	2023-09-05 23:35:29 +00:00
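The name overlap described above can be reproduced with the regexp package alone. The sketch assumes the -run value is applied as an unanchored regular expression match on the test name; `selected` is a hypothetical helper, not part of the go command.

```go
package main

import (
	"fmt"
	"regexp"
)

// selected (hypothetical helper) returns the names matched by pattern using
// unanchored matching.
func selected(pattern string, names []string) []string {
	re := regexp.MustCompile(pattern)
	var out []string
	for _, n := range names {
		if re.MatchString(n) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	names := []string{"TestUnshareMountNameSpace", "TestUnshareMountNameSpaceChroot"}
	fmt.Println(len(selected("TestUnshareMountNameSpace", names)))   // prints 2
	fmt.Println(len(selected("^TestUnshareMountNameSpace$", names))) // prints 1
}
```

Without the anchors, the shorter name is a substring of the longer one, so both tests are selected.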
chanxuehong	aaa384cf3a	math/big, math/rand: use the built-in max function Change-Id: I71a38dd20bfaf2b1aed18892d54eeb017d3d7d66 GitHub-Last-Rev: `8da43b2cbd` GitHub-Pull-Request: golang/go#61955 Reviewed-on: https://go-review.googlesource.com/c/go/+/518595 Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: qiulaidongfeng <2645477756@qq.com>	2023-08-17 16:42:19 +00:00
qiulaidongfeng	1d3a77e5e6	math/big: using the min built-in function Change-Id: I9e95806116a8547ec782f66226d1b1382c6156de GitHub-Last-Rev: `5b4ce994c1` GitHub-Pull-Request: golang/go#61829 Reviewed-on: https://go-review.googlesource.com/c/go/+/516895 Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com>	2023-08-11 02:52:49 +00:00
Srinivas Pokala	a37da52d75	math: enable huge argument tests on s390x The new s390x assembly implementations of Sin/Cos/SinCos/Tan handle the huge argument tests. Updates #29240 Change-Id: I9f22d9714528ef2af52c749079f3727250089baf Reviewed-on: https://go-review.googlesource.com/c/go/+/509675 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>	2023-07-31 16:23:41 +00:00
Srinivas Pokala	20ea988421	math: huge argument handling for sin/cos in s390x Currently on s390x, the sin/cos assembly implementation does not handle huge arguments. This change makes the assembly routine fall back to the native Go implementation for huge arguments. Implementing the check in assembly gives better performance, in terms of execution cycles, than making the change in native Go. name Go_changes Asm_changes Sin/input_size_(0.5)-8 11.85ns ± 0% 5.32ns ± 1% Sin/input_size_(1<<20)-8 15.32ns ± 0% 9.75ns ± 3% Sin/input_size_(1<<_40)-8 17.9ns ± 0% 10.3ns ± 6% Sin/input_size_(1<<50)-8 16.33ns ± 0% 9.75ns ± 6% Sin/input_size_(1<<60)-8 33.0ns ± 1% 29.1ns ± 0% Sin/input_size_(1<<80)-8 29.9ns ± 0% 27.2ns ± 2% Sin/input_size_(1<<200)-8 31.5ns ± 1% 28.3ns ± 0% Sin/input_size_(1<<480)-8 29.4ns ± 1% 28.0ns ± 1% Sin/input_size_(1234567891234567_<<_180)-8 29.3ns ± 1% 28.0ns ± 0% Cos/input_size_(0.5)-8 10.33ns ± 0% 5.69ns ± 1% Cos/input_size_(1<<20)-8 16.67ns ± 0% 9.18ns ± 0% Cos/input_size_(1<<_40)-8 18.50ns ± 0% 9.45ns ± 3% Cos/input_size_(1<<50)-8 16.67ns ± 0% 9.18ns ± 1% Cos/input_size_(1<<60)-8 31.6ns ± 1% 26.7ns ± 2% Cos/input_size_(1<<80)-8 31.3ns ± 0% 25.5ns ± 1% Cos/input_size_(1<<200)-8 30.0ns ± 0% 26.7ns ± 1% Cos/input_size_(1<<480)-8 31.9ns ± 2% 27.0ns ± 0% Cos/input_size_(1234567891234567_<<_180)-8 31.8ns ± 0% 26.9ns ± 0% Fixes #29240 Change-Id: Id2ebcfa113926f27510d527e80daaddad925a707 Reviewed-on: https://go-review.googlesource.com/c/go/+/469635 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Bill O'Farrell <billotosyr@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2023-07-31 04:25:54 +00:00
root	a8a6f90a23	math: support to handle huge arguments in tan function on s390x Currently on s390x, the tan assembly implementation does not handle huge arguments at all. This change checks for large arguments and falls back from the assembly code to the native Go implementation when the argument is huge. The check is implemented in assembly to get better performance than the native Go implementation. Benchmark details of the tan function with table-driven inputs are posted on the linked issue. Fixes #37854 Change-Id: I4e5321e65c27b7ce8c497fc9d3991ca8604753d2 Reviewed-on: https://go-review.googlesource.com/c/go/+/470595 Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Ian Lance Taylor <iant@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Keith Randall <khr@golang.org>	2023-07-27 23:30:00 +00:00
Keith Randall	9b33543339	math: test large negative values as args for trig functions Sin/Tan are odd, Cos is even, so it is easy to compute the correct result from the positive argument case. Change-Id: If851d00fc7f515ece8199cf56d21186ced51e94f Reviewed-on: https://go-review.googlesource.com/c/go/+/509815 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Srinivas Pokala <Pokala.Srinivas@ibm.com> Reviewed-by: Robert Griesemer <gri@google.com> Reviewed-by: Keith Randall <khr@google.com>	2023-07-17 21:05:34 +00:00
Michael Munday	158d11196f	math: add test that covers riscv64 fnm{add,sub} codegen Adds a test that triggers the RISC-V fused multiply-add code generation bug fixed by CL 506575. Change-Id: Ia3a55a68b48c5cc6beac4e5235975dea31f3faf2 Reviewed-on: https://go-review.googlesource.com/c/go/+/507035 Auto-Submit: M Zhuo <mzh@golangcn.org> Reviewed-by: M Zhuo <mzh@golangcn.org> Run-TryBot: Michael Munday <mike.munday@lowrisc.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Joedian Reid <joedian@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2023-07-07 17:39:26 +00:00
Michael Munday	c8dad424bf	math: fix portable FMA when xy < 0 and xy == -z When xy == -z the portable implementation of FMA copied the sign bit from xy into the result. This meant that when xy == -z and xy < 0 the result was -0 which is incorrect. Fixes #61130. Change-Id: Ib93a568b7bdb9031e2aedfa1bdfa9bddde90851d Reviewed-on: https://go-review.googlesource.com/c/go/+/507376 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Michael Munday <mike.munday@lowrisc.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Joedian Reid <joedian@golang.org>	2023-07-05 22:05:30 +00:00
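One concrete instance of the case described above is x = -1, y = 1, z = 1, where xy = -1 equals -z and xy is negative. The sketch below inspects the sign bit of the result; the specific inputs are chosen for illustration and are not taken from the commit's test.

```go
package main

import (
	"fmt"
	"math"
)

// fmaSign (hypothetical helper) returns the FMA result and whether its sign
// bit is set.
func fmaSign(x, y, z float64) (float64, bool) {
	r := math.FMA(x, y, z)
	return r, math.Signbit(r)
}

func main() {
	// xy == -z and xy < 0: the exact sum is zero, and under round-to-nearest
	// it should be +0. With the fix in place the sign bit is expected to be clear.
	r, neg := fmaSign(-1, 1, 1)
	fmt.Println(r == 0, neg)
}
```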
Ian Lance Taylor	65db95d0ed	math: document that Min/Max differ from min/max For #59488 Fixes #60616 Change-Id: Idf9f42d7d868999664652dd7b478684a474f1d96 Reviewed-on: https://go-review.googlesource.com/c/go/+/501355 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Rob Pike <r@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2023-06-15 19:45:12 +00:00
Alexander Yastrebov	8ffc931eae	all: fix spelling errors Fix spelling errors discovered using https://github.com/codespell-project/codespell. Errors in data files and vendored packages are ignored. Change-Id: I83c7818222f2eea69afbd270c15b7897678131dc GitHub-Last-Rev: `3491615b1b` GitHub-Pull-Request: golang/go#60758 Reviewed-on: https://go-review.googlesource.com/c/go/+/502576 Auto-Submit: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com>	2023-06-14 00:03:57 +00:00
cui fliter	8b53c2d2fc	all: fix mismatched symbols There are some symbol mismatches in the comments, this commit attempts to fix them Change-Id: I5c9075e5218defe9233c075744d243b26ff68496 Reviewed-on: https://go-review.googlesource.com/c/go/+/492996 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: shuang cui <imcusg@gmail.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Pratt <mpratt@google.com>	2023-06-13 20:02:49 +00:00
Alan Donovan	fdbc66d6dd	math/big: rename Int.ToFloat64 to Float64 The "To" prefix was a relic of the first draft that I failed to make consistent with the unprefixed name used in the proposal. Fortunately iant spotted it during the API audit. Updates #56984 Updates #60560 Change-Id: Ifa6eeddf6dd5f0637c0568e383f9a4bef88b10f9 Reviewed-on: https://go-review.googlesource.com/c/go/+/500116 Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Alan Donovan <adonovan@google.com>	2023-06-02 14:22:24 +00:00
Ian Lance Taylor	bf14663943	Revert "math: add Compare and Compare32" This reverts CL 467515. Now that we have cmp.Compare, we don't need math.Compare or math.Compare32 after all. For #56491 Fixes #60519 Change-Id: I8ed33464adfc6d69bd6b328edb26aa2ee3d234d9 Reviewed-on: https://go-review.googlesource.com/c/go/+/499416 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Eli Bendersky <eliben@google.com>	2023-05-31 21:19:39 +00:00
Egon Elbre	74af79bcf6	fmt,math/big,net/url: fixes to old Benchmarks b.ResetTimer used to also stop the timer, however it does not anymore. These benchmarks hadn't been fixed and as a result ended up measuring some additional things. Also, make some for loops more conventional. Change-Id: I76ca68456d85eec51722a80587e5b2c9f5d836a1 Reviewed-on: https://go-review.googlesource.com/c/go/+/496996 Run-TryBot: Damien Neil <dneil@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Auto-Submit: Damien Neil <dneil@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Damien Neil <dneil@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com>	2023-05-23 20:25:13 +00:00
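The timer behavior behind this fix can be sketched as follows: `b.ResetTimer` zeroes the elapsed time but leaves the timer running, so unmeasured work needs an explicit `b.StopTimer`. The benchmark body and the names `benchSum` and `sum` are invented for illustration and do not come from the affected packages.

```go
package main

import (
	"fmt"
	"testing"
)

// sum is a package-level sink (hypothetical name) that keeps the loop's
// result alive.
var sum int

func benchSum(b *testing.B) {
	// Setup that should not be measured.
	data := make([]int, 1024)
	for i := range data {
		data[i] = i
	}
	b.ResetTimer() // discards setup time; the timer keeps running

	for i := 0; i < b.N; i++ {
		s := 0
		for _, v := range data {
			s += v
		}
		sum = s
	}

	b.StopTimer() // stop explicitly before any unmeasured follow-up work
	if sum != 1023*1024/2 {
		b.Fatal("unexpected sum")
	}
}

func main() {
	res := testing.Benchmark(benchSum)
	fmt.Println(res.N, res.NsPerOp())
}
```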
cui fliter	57e3189821	all: fix a lot of comments Fix comments, including duplicate is, wrong phrases and articles, misspellings, etc. Change-Id: I8bfea53b9b275e649757cc4bee6a8a026ed9c7a4 Reviewed-on: https://go-review.googlesource.com/c/go/+/493035 Reviewed-by: Benny Siegert <bsiegert@gmail.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: shuang cui <imcusg@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com>	2023-05-10 12:59:20 +00:00
Lynn Boger	e23322e2cc	cmd/internal/obj/ppc64: modify PCALIGN to ensure alignment The initial purpose of PCALIGN was to identify code where it would be beneficial to align code for performance, but avoid cases where too many NOPs were added. On p10, it is now necessary to enforce a certain alignment in some cases, so the behavior of PCALIGN needs to be slightly different. Code will now be aligned to the value specified on the PCALIGN instruction regardless of number of NOPs added, which is more intuitive and consistent with power assembler alignment directives. This also adds 64 as a possible alignment value. The existing values used in PCALIGN were modified according to the new behavior. A testcase was updated and performance testing was done to verify that this does not adversely affect performance. Change-Id: Iad1cf5ff112e5bfc0514f0805be90e24095e932b Reviewed-on: https://go-review.googlesource.com/c/go/+/485056 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Archana Ravindar <aravind5@in.ibm.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Paul Murphy <murp@ibm.com> Reviewed-by: Bryan Mills <bcmills@google.com>	2023-04-21 16:47:45 +00:00
Ian Lance Taylor	9a0c506a4e	all: re-run stringer Re-run all go:generate stringer commands. This mostly adds checks that the constant values did not change, but does add new strings for the debug/dwarf and internal/pkgbits packages. Change-Id: I5fc41f20da47338152c183d45d5ae65074e2fccf Reviewed-on: https://go-review.googlesource.com/c/go/+/483717 Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2023-04-11 20:24:07 +00:00
Ian Lance Taylor	6d2cac12db	math/rand: clarify Seed deprecation note Fixes #59331 Change-Id: I62156be2f2758c59349c3b02db6cf9140429c9e3 Reviewed-on: https://go-review.googlesource.com/c/go/+/481915 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Bypass: Ian Lance Taylor <iant@google.com> Reviewed-by: Russ Cox <rsc@golang.org>	2023-04-04 20:18:09 +00:00

710 Commits