Commit Graph

8 Commits

Author SHA1 Message Date
Russ Cox 8abde68f19 math/rand/v2: delete Mitchell/Reeds source
These slowdowns are because the benchmarks now use PCG instead of the
Mitchell/Reeds LFSR. PCG is in fact a bit slower
(but generates statistically far better random numbers).

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 01ff938549.amd64 │           afa459a2f0.amd64           │
                        │      sec/op      │    sec/op     vs base                │
PCG_DXSM-32                    1.490n ± 0%    1.488n ± 2%        ~ (p=0.408 n=20)
SourceUint64-32                1.352n ± 1%    1.450n ± 3%   +7.21% (p=0.000 n=20)
GlobalInt64-32                 2.083n ± 0%    2.067n ± 2%        ~ (p=0.223 n=20)
GlobalInt64Parallel-32        0.1035n ± 1%   0.1044n ± 2%        ~ (p=0.010 n=20)
GlobalUint64-32                2.038n ± 1%    2.085n ± 0%   +2.28% (p=0.000 n=20)
GlobalUint64Parallel-32       0.1006n ± 1%   0.1008n ± 1%        ~ (p=0.733 n=20)
Int64-32                       1.687n ± 2%    1.779n ± 1%   +5.48% (p=0.000 n=20)
Uint64-32                      1.674n ± 2%    1.854n ± 2%  +10.69% (p=0.000 n=20)
GlobalIntN1000-32              3.135n ± 1%    3.140n ± 3%        ~ (p=0.794 n=20)
IntN1000-32                    2.478n ± 1%    2.496n ± 1%   +0.73% (p=0.006 n=20)
Int64N1000-32                  2.455n ± 1%    2.510n ± 2%   +2.22% (p=0.000 n=20)
Int64N1e8-32                   2.467n ± 2%    2.471n ± 2%        ~ (p=0.050 n=20)
Int64N1e9-32                   2.454n ± 1%    2.488n ± 2%   +1.39% (p=0.000 n=20)
Int64N2e9-32                   2.482n ± 1%    2.478n ± 2%        ~ (p=0.066 n=20)
Int64N1e18-32                  3.349n ± 2%    3.088n ± 1%   -7.81% (p=0.000 n=20)
Int64N2e18-32                  3.537n ± 1%    3.493n ± 1%   -1.24% (p=0.002 n=20)
Int64N4e18-32                  4.917n ± 0%    5.060n ± 2%   +2.91% (p=0.000 n=20)
Int32N1000-32                  2.386n ± 1%    2.620n ± 1%   +9.76% (p=0.000 n=20)
Int32N1e8-32                   2.366n ± 1%    2.652n ± 0%  +12.11% (p=0.000 n=20)
Int32N1e9-32                   2.355n ± 2%    2.644n ± 1%  +12.32% (p=0.000 n=20)
Int32N2e9-32                   2.371n ± 1%    2.619n ± 2%  +10.48% (p=0.000 n=20)
Float32-32                     2.245n ± 2%    2.261n ± 1%        ~ (p=0.625 n=20)
Float64-32                     2.235n ± 1%    2.241n ± 2%        ~ (p=0.393 n=20)
ExpFloat64-32                  3.813n ± 3%    3.716n ± 1%   -2.53% (p=0.000 n=20)
NormFloat64-32                 3.652n ± 2%    3.718n ± 1%   +1.79% (p=0.006 n=20)
Perm3-32                       33.12n ± 3%    34.11n ± 2%        ~ (p=0.021 n=20)
Perm30-32                      205.1n ± 1%    200.6n ± 0%   -2.17% (p=0.000 n=20)
Perm30ViaShuffle-32            110.8n ± 1%    109.7n ± 1%   -0.99% (p=0.002 n=20)
ShuffleOverhead-32             113.0n ± 1%    107.2n ± 1%   -5.09% (p=0.000 n=20)
Concurrent-32                  2.100n ± 0%    2.108n ± 6%        ~ (p=0.103 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
                       │ 01ff938549.arm64 │           afa459a2f0.arm64           │
                       │      sec/op      │    sec/op     vs base                │
PCG_DXSM-8                    2.531n ± 0%    2.531n ± 0%        ~ (p=0.763 n=20)
SourceUint64-8                2.258n ± 1%    2.531n ± 0%  +12.09% (p=0.000 n=20)
GlobalInt64-8                 2.167n ± 0%    2.177n ± 1%        ~ (p=0.213 n=20)
GlobalInt64Parallel-8        0.4310n ± 0%   0.4319n ± 0%        ~ (p=0.027 n=20)
GlobalUint64-8                2.182n ± 1%    2.185n ± 1%        ~ (p=0.683 n=20)
GlobalUint64Parallel-8       0.4297n ± 0%   0.4295n ± 1%        ~ (p=0.941 n=20)
Int64-8                       2.472n ± 1%    4.104n ± 0%  +66.00% (p=0.000 n=20)
Uint64-8                      2.449n ± 1%    4.080n ± 0%  +66.60% (p=0.000 n=20)
GlobalIntN1000-8              2.814n ± 2%    2.814n ± 1%        ~ (p=0.972 n=20)
IntN1000-8                    2.998n ± 2%    4.140n ± 0%  +38.09% (p=0.000 n=20)
Int64N1000-8                  2.949n ± 2%    4.139n ± 0%  +40.35% (p=0.000 n=20)
Int64N1e8-8                   2.953n ± 2%    4.140n ± 0%  +40.22% (p=0.000 n=20)
Int64N1e9-8                   2.950n ± 0%    4.139n ± 0%  +40.32% (p=0.000 n=20)
Int64N2e9-8                   2.946n ± 2%    4.140n ± 0%  +40.53% (p=0.000 n=20)
Int64N1e18-8                  3.779n ± 1%    5.273n ± 0%  +39.52% (p=0.000 n=20)
Int64N2e18-8                  4.370n ± 1%    6.059n ± 0%  +38.65% (p=0.000 n=20)
Int64N4e18-8                  6.544n ± 1%    8.803n ± 0%  +34.52% (p=0.000 n=20)
Int32N1000-8                  2.950n ± 0%    4.131n ± 0%  +40.06% (p=0.000 n=20)
Int32N1e8-8                   2.950n ± 2%    4.131n ± 0%  +40.03% (p=0.000 n=20)
Int32N1e9-8                   2.951n ± 2%    4.131n ± 0%  +39.99% (p=0.000 n=20)
Int32N2e9-8                   2.950n ± 2%    4.131n ± 0%  +40.03% (p=0.000 n=20)
Float32-8                     3.441n ± 0%    4.110n ± 0%  +19.44% (p=0.000 n=20)
Float64-8                     3.442n ± 0%    4.104n ± 0%  +19.24% (p=0.000 n=20)
ExpFloat64-8                  4.481n ± 0%    5.338n ± 0%  +19.11% (p=0.000 n=20)
NormFloat64-8                 4.725n ± 0%    5.731n ± 0%  +21.28% (p=0.000 n=20)
Perm3-8                       26.55n ± 0%    26.62n ± 0%   +0.28% (p=0.000 n=20)
Perm30-8                      181.9n ± 0%    194.6n ± 2%   +6.98% (p=0.000 n=20)
Perm30ViaShuffle-8            142.9n ± 0%    156.4n ± 0%   +9.45% (p=0.000 n=20)
ShuffleOverhead-8             120.8n ± 2%    125.8n ± 0%   +4.10% (p=0.000 n=20)
Concurrent-8                  2.421n ± 6%    2.654n ± 6%   +9.67% (p=0.002 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 01ff938549.386 │            afa459a2f0.386             │
                        │     sec/op     │    sec/op     vs base                 │
PCG_DXSM-32                  7.613n ± 1%    7.793n ± 2%    +2.38% (p=0.000 n=20)
SourceUint64-32              2.069n ± 0%    7.680n ± 1%  +271.19% (p=0.000 n=20)
GlobalInt64-32               3.456n ± 1%    3.474n ± 3%         ~ (p=0.654 n=20)
GlobalInt64Parallel-32      0.3252n ± 0%   0.3253n ± 0%         ~ (p=0.952 n=20)
GlobalUint64-32              3.573n ± 1%    3.433n ± 2%    -3.92% (p=0.000 n=20)
GlobalUint64Parallel-32     0.3159n ± 0%   0.3156n ± 0%         ~ (p=0.223 n=20)
Int64-32                     2.562n ± 2%    7.707n ± 1%  +200.74% (p=0.000 n=20)
Uint64-32                    2.592n ± 0%    7.714n ± 1%  +197.65% (p=0.000 n=20)
GlobalIntN1000-32            6.266n ± 2%    6.236n ± 1%         ~ (p=0.039 n=20)
IntN1000-32                  4.724n ± 2%   10.410n ± 1%  +120.39% (p=0.000 n=20)
Int64N1000-32                5.490n ± 2%   10.975n ± 2%   +99.89% (p=0.000 n=20)
Int64N1e8-32                 5.513n ± 2%   10.980n ± 1%   +99.15% (p=0.000 n=20)
Int64N1e9-32                 5.476n ± 1%   10.950n ± 0%   +99.96% (p=0.000 n=20)
Int64N2e9-32                 5.501n ± 2%   11.110n ± 1%  +101.96% (p=0.000 n=20)
Int64N1e18-32                9.043n ± 2%   15.180n ± 2%   +67.86% (p=0.000 n=20)
Int64N2e18-32                9.601n ± 2%   15.610n ± 1%   +62.60% (p=0.000 n=20)
Int64N4e18-32                12.00n ± 1%    19.23n ± 2%   +60.14% (p=0.000 n=20)
Int32N1000-32                4.829n ± 2%   10.345n ± 1%  +114.25% (p=0.000 n=20)
Int32N1e8-32                 4.825n ± 2%   10.330n ± 1%  +114.09% (p=0.000 n=20)
Int32N1e9-32                 4.830n ± 2%   10.350n ± 1%  +114.26% (p=0.000 n=20)
Int32N2e9-32                 4.750n ± 2%   10.345n ± 1%  +117.81% (p=0.000 n=20)
Float32-32                   10.89n ± 4%    13.57n ± 1%   +24.61% (p=0.000 n=20)
Float64-32                   19.60n ± 4%    22.95n ± 4%   +17.12% (p=0.000 n=20)
ExpFloat64-32                12.96n ± 3%    15.23n ± 2%   +17.47% (p=0.000 n=20)
NormFloat64-32               7.516n ± 1%   13.780n ± 1%   +83.34% (p=0.000 n=20)
Perm3-32                     36.78n ± 2%    46.62n ± 2%   +26.72% (p=0.000 n=20)
Perm30-32                    238.9n ± 2%    400.7n ± 1%   +67.73% (p=0.000 n=20)
Perm30ViaShuffle-32          189.7n ± 2%    350.5n ± 1%   +84.79% (p=0.000 n=20)
ShuffleOverhead-32           159.8n ± 1%    326.0n ± 2%  +104.01% (p=0.000 n=20)
Concurrent-32                3.286n ± 1%    3.290n ± 0%         ~ (p=0.743 n=20)

On the other hand, compared to the original "update benchmarks" CL,
the cleanups we've made more than compensate for PCG being a bit
slower than LFSR, at least on 64-bit x86. ARM64 (Apple M1) is a bit
slower: perhaps the 64x64→128 multiply is slower there for some reason.
386 is noticeably slower, but it's also a non-SSA backend.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.amd64 │            afa459a2f0.amd64            │
                        │      sec/op      │    sec/op     vs base                  │
SourceUint64-32                1.555n ± 1%    1.450n ± 3%   -6.78% (p=0.000 n=20)
GlobalInt64-32                 2.071n ± 1%    2.067n ± 2%        ~ (p=0.673 n=20)
GlobalInt63Parallel-32        0.1023n ± 1%
GlobalInt64Parallel-32                       0.1044n ± 2%
GlobalUint64-32                5.193n ± 1%    2.085n ± 0%  -59.86% (p=0.000 n=20)
GlobalUint64Parallel-32       0.2341n ± 0%   0.1008n ± 1%  -56.93% (p=0.000 n=20)
Int64-32                       2.056n ± 2%    1.779n ± 1%  -13.47% (p=0.000 n=20)
Uint64-32                      2.077n ± 2%    1.854n ± 2%  -10.74% (p=0.000 n=20)
GlobalIntN1000-32              4.077n ± 2%    3.140n ± 3%  -22.98% (p=0.000 n=20)
IntN1000-32                    3.476n ± 2%    2.496n ± 1%  -28.19% (p=0.000 n=20)
Int64N1000-32                  3.059n ± 1%    2.510n ± 2%  -17.96% (p=0.000 n=20)
Int64N1e8-32                   2.942n ± 1%    2.471n ± 2%  -15.98% (p=0.000 n=20)
Int64N1e9-32                   2.932n ± 1%    2.488n ± 2%  -15.14% (p=0.000 n=20)
Int64N2e9-32                   2.925n ± 1%    2.478n ± 2%  -15.30% (p=0.000 n=20)
Int64N1e18-32                  3.116n ± 1%    3.088n ± 1%        ~ (p=0.013 n=20)
Int64N2e18-32                  4.067n ± 1%    3.493n ± 1%  -14.11% (p=0.000 n=20)
Int64N4e18-32                  4.054n ± 1%    5.060n ± 2%  +24.80% (p=0.000 n=20)
Int32N1000-32                  2.951n ± 1%    2.620n ± 1%  -11.22% (p=0.000 n=20)
Int32N1e8-32                   3.102n ± 1%    2.652n ± 0%  -14.50% (p=0.000 n=20)
Int32N1e9-32                   3.535n ± 1%    2.644n ± 1%  -25.20% (p=0.000 n=20)
Int32N2e9-32                   3.514n ± 1%    2.619n ± 2%  -25.47% (p=0.000 n=20)
Float32-32                     2.760n ± 1%    2.261n ± 1%  -18.06% (p=0.000 n=20)
Float64-32                     2.284n ± 1%    2.241n ± 2%        ~ (p=0.016 n=20)
ExpFloat64-32                  3.757n ± 1%    3.716n ± 1%        ~ (p=0.034 n=20)
NormFloat64-32                 3.837n ± 1%    3.718n ± 1%   -3.09% (p=0.000 n=20)
Perm3-32                       35.23n ± 2%    34.11n ± 2%   -3.19% (p=0.000 n=20)
Perm30-32                      208.8n ± 1%    200.6n ± 0%   -3.93% (p=0.000 n=20)
Perm30ViaShuffle-32            111.7n ± 1%    109.7n ± 1%   -1.84% (p=0.000 n=20)
ShuffleOverhead-32             101.1n ± 1%    107.2n ± 1%   +6.03% (p=0.000 n=20)
Concurrent-32                  2.108n ± 7%    2.108n ± 6%        ~ (p=0.644 n=20)
PCG_DXSM-32                                   1.488n ± 2%

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ 220860f76f.arm64 │            afa459a2f0.arm64            │
                       │      sec/op      │    sec/op     vs base                  │
SourceUint64-8                2.316n ± 1%    2.531n ± 0%   +9.33% (p=0.000 n=20)
GlobalInt64-8                 2.183n ± 1%    2.177n ± 1%        ~ (p=0.533 n=20)
GlobalInt63Parallel-8        0.4331n ± 0%
GlobalInt64Parallel-8                       0.4319n ± 0%
GlobalUint64-8                4.377n ± 2%    2.185n ± 1%  -50.07% (p=0.000 n=20)
GlobalUint64Parallel-8       0.9237n ± 0%   0.4295n ± 1%  -53.50% (p=0.000 n=20)
Int64-8                       2.538n ± 1%    4.104n ± 0%  +61.68% (p=0.000 n=20)
Uint64-8                      2.604n ± 1%    4.080n ± 0%  +56.68% (p=0.000 n=20)
GlobalIntN1000-8              3.857n ± 2%    2.814n ± 1%  -27.04% (p=0.000 n=20)
IntN1000-8                    3.822n ± 2%    4.140n ± 0%   +8.32% (p=0.000 n=20)
Int64N1000-8                  3.318n ± 0%    4.139n ± 0%  +24.74% (p=0.000 n=20)
Int64N1e8-8                   3.349n ± 1%    4.140n ± 0%  +23.64% (p=0.000 n=20)
Int64N1e9-8                   3.317n ± 2%    4.139n ± 0%  +24.80% (p=0.000 n=20)
Int64N2e9-8                   3.317n ± 2%    4.140n ± 0%  +24.81% (p=0.000 n=20)
Int64N1e18-8                  3.542n ± 1%    5.273n ± 0%  +48.85% (p=0.000 n=20)
Int64N2e18-8                  5.087n ± 0%    6.059n ± 0%  +19.12% (p=0.000 n=20)
Int64N4e18-8                  5.084n ± 0%    8.803n ± 0%  +73.16% (p=0.000 n=20)
Int32N1000-8                  3.208n ± 2%    4.131n ± 0%  +28.79% (p=0.000 n=20)
Int32N1e8-8                   3.610n ± 1%    4.131n ± 0%  +14.43% (p=0.000 n=20)
Int32N1e9-8                   4.235n ± 0%    4.131n ± 0%   -2.44% (p=0.000 n=20)
Int32N2e9-8                   4.229n ± 1%    4.131n ± 0%   -2.33% (p=0.000 n=20)
Float32-8                     3.468n ± 0%    4.110n ± 0%  +18.50% (p=0.000 n=20)
Float64-8                     3.447n ± 0%    4.104n ± 0%  +19.05% (p=0.000 n=20)
ExpFloat64-8                  4.567n ± 0%    5.338n ± 0%  +16.86% (p=0.000 n=20)
NormFloat64-8                 4.821n ± 0%    5.731n ± 0%  +18.89% (p=0.000 n=20)
Perm3-8                       28.89n ± 0%    26.62n ± 0%   -7.84% (p=0.000 n=20)
Perm30-8                      175.7n ± 0%    194.6n ± 2%  +10.76% (p=0.000 n=20)
Perm30ViaShuffle-8            153.5n ± 0%    156.4n ± 0%   +1.86% (p=0.000 n=20)
ShuffleOverhead-8             119.8n ± 1%    125.8n ± 0%   +4.97% (p=0.000 n=20)
Concurrent-8                  2.433n ± 3%    2.654n ± 6%   +9.13% (p=0.001 n=20)
PCG_DXSM-8                                   2.531n ± 0%

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.386 │             afa459a2f0.386              │
                        │     sec/op     │    sec/op     vs base                   │
SourceUint64-32             2.370n ±  1%    7.680n ± 1%  +224.05% (p=0.000 n=20)
GlobalInt64-32              3.569n ±  1%    3.474n ± 3%    -2.66% (p=0.001 n=20)
GlobalInt63Parallel-32     0.3221n ±  1%
GlobalInt64Parallel-32                     0.3253n ± 0%
GlobalUint64-32             8.797n ± 10%    3.433n ± 2%   -60.98% (p=0.000 n=20)
GlobalUint64Parallel-32    0.6351n ±  0%   0.3156n ± 0%   -50.31% (p=0.000 n=20)
Int64-32                    2.612n ±  2%    7.707n ± 1%  +195.04% (p=0.000 n=20)
Uint64-32                   3.350n ±  1%    7.714n ± 1%  +130.25% (p=0.000 n=20)
GlobalIntN1000-32           5.892n ±  1%    6.236n ± 1%    +5.82% (p=0.000 n=20)
IntN1000-32                 4.546n ±  1%   10.410n ± 1%  +128.97% (p=0.000 n=20)
Int64N1000-32               14.59n ±  1%    10.97n ± 2%   -24.75% (p=0.000 n=20)
Int64N1e8-32                14.76n ±  2%    10.98n ± 1%   -25.58% (p=0.000 n=20)
Int64N1e9-32                16.57n ±  1%    10.95n ± 0%   -33.90% (p=0.000 n=20)
Int64N2e9-32                14.54n ±  1%    11.11n ± 1%   -23.62% (p=0.000 n=20)
Int64N1e18-32               16.14n ±  1%    15.18n ± 2%    -5.95% (p=0.000 n=20)
Int64N2e18-32               18.10n ±  1%    15.61n ± 1%   -13.73% (p=0.000 n=20)
Int64N4e18-32               18.65n ±  1%    19.23n ± 2%    +3.08% (p=0.000 n=20)
Int32N1000-32               3.560n ±  1%   10.345n ± 1%  +190.55% (p=0.000 n=20)
Int32N1e8-32                3.770n ±  2%   10.330n ± 1%  +174.01% (p=0.000 n=20)
Int32N1e9-32                4.098n ±  0%   10.350n ± 1%  +152.53% (p=0.000 n=20)
Int32N2e9-32                4.179n ±  1%   10.345n ± 1%  +147.52% (p=0.000 n=20)
Float32-32                  21.18n ±  4%    13.57n ± 1%   -35.93% (p=0.000 n=20)
Float64-32                  20.60n ±  2%    22.95n ± 4%   +11.41% (p=0.000 n=20)
ExpFloat64-32               13.07n ±  0%    15.23n ± 2%   +16.48% (p=0.000 n=20)
NormFloat64-32              7.738n ±  2%   13.780n ± 1%   +78.08% (p=0.000 n=20)
Perm3-32                    36.73n ±  1%    46.62n ± 2%   +26.91% (p=0.000 n=20)
Perm30-32                   211.9n ±  1%    400.7n ± 1%   +89.05% (p=0.000 n=20)
Perm30ViaShuffle-32         165.2n ±  1%    350.5n ± 1%  +112.20% (p=0.000 n=20)
ShuffleOverhead-32          133.9n ±  1%    326.0n ± 2%  +143.37% (p=0.000 n=20)
Concurrent-32               3.287n ±  2%    3.290n ± 0%         ~ (p=0.365 n=20)
PCG_DXSM-32                                 7.793n ± 2%

For #61716.

Change-Id: I4e9c0525b5f84a2ac46f23da9e365495e2d05777
Reviewed-on: https://go-review.googlesource.com/c/go/+/502506
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-30 17:09:26 +00:00
Russ Cox 8631fcbf31 math/rand/v2: add PCG-DXSM
For the original math/rand, we ported Plan 9's random number
generator, which was a refinement by Ken Thompson of an algorithm
by Don Mitchell and Jim Reeds, which Mitchell in turn recalls as
having been derived from an algorithm by Marsaglia. At its core,
it is an additive lagged Fibonacci generator (ALFG).
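
As a rough illustration of what an additive lagged Fibonacci generator
looks like, here is a minimal sketch. It is not the Plan 9 or math/rand
code: the type and function names are hypothetical, the short lag of 273
is an assumption (only the 607-word state size appears in these commit
messages), and the seeding is a placeholder LCG.

```go
package main

import "fmt"

const (
	lagLong  = 607 // state size in words, as mentioned in the text
	lagShort = 273 // shorter lag; an assumption made for this sketch
)

// alfg is a hypothetical additive lagged Fibonacci generator:
// x[n] = x[n-lagLong] + x[n-lagShort] (mod 2^64), kept in a circular buffer.
type alfg struct {
	vec       [lagLong]uint64
	feed, tap int
}

// newALFG fills the buffer from a simple 64-bit LCG. The real generator
// seeds its state differently; this is only a placeholder.
func newALFG(seed uint64) *alfg {
	g := &alfg{feed: lagLong - lagShort}
	for i := range g.vec {
		seed = seed*6364136223846793005 + 1442695040888963407
		g.vec[i] = seed
	}
	return g
}

// next returns the next value: two index decrements, one add, one store.
func (g *alfg) next() uint64 {
	g.tap--
	if g.tap < 0 {
		g.tap += lagLong
	}
	g.feed--
	if g.feed < 0 {
		g.feed += lagLong
	}
	x := g.vec[g.feed] + g.vec[g.tap]
	g.vec[g.feed] = x
	return x
}

func main() {
	g := newALFG(1)
	for i := 0; i < 3; i++ {
		fmt.Println(g.next())
	}
}
```

The per-call work is tiny, but every generator instance carries the
full 607-word buffer.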

Whatever the details of the history, this generator is nowhere
near the current state of the art for simple, pseudo-random
generators.

This CL adds an implementation of Melissa O'Neill's PCG, specifically
the variant PCG-DXSM, which she defined after writing the PCG paper
and which is now the default in NumPy. The state update is slightly slower
(a few multiplies and adds, instead of a few adds), but the state
is dramatically smaller (2 words instead of 607). The statistical
output properties are better too.
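
To make the "few multiplies and adds" and the two-word state concrete,
here is a minimal sketch of a PCG-DXSM style generator. The type and
method names are hypothetical, and the constants are the commonly
published PCG 128-bit multiplier and increment and the DXSM "cheap
multiplier", taken here as assumptions; the code in this CL may differ
in detail.

```go
package main

import (
	"fmt"
	"math/bits"
)

// pcg is a hypothetical stand-in for a PCG-DXSM source. The entire
// state is one 128-bit integer held in two 64-bit words.
type pcg struct {
	hi, lo uint64
}

// Constants assumed for this sketch (see the lead-in above).
const (
	mulHi    = 2549297995355413924
	mulLo    = 4865540595714422341
	incHi    = 6364136223846793005
	incLo    = 1442695040888963407
	cheapMul = 0xda942042e4dd58b5
)

// step advances the 128-bit LCG: state = state*mul + inc (mod 2^128).
func (p *pcg) step() (hi, lo uint64) {
	hi, lo = bits.Mul64(p.lo, mulLo) // 64x64→128 multiply of the low words
	hi += p.hi*mulLo + p.lo*mulHi    // cross terms only reach the high word
	var carry uint64
	lo, carry = bits.Add64(lo, incLo, 0)
	hi, _ = bits.Add64(hi, incHi, carry)
	p.hi, p.lo = hi, lo
	return hi, lo
}

// next applies the DXSM ("double xorshift multiply") output permutation
// to the freshly advanced state.
func (p *pcg) next() uint64 {
	hi, lo := p.step()
	hi ^= hi >> 32
	hi *= cheapMul
	hi ^= hi >> 48
	hi *= lo | 1
	return hi
}

func main() {
	p := &pcg{hi: 1, lo: 2}
	for i := 0; i < 3; i++ {
		fmt.Println(p.next())
	}
}
```

The 64x64→128 multiply in step is the operation that is cheap on
64-bit hardware but has to be synthesized from 32-bit multiplies on 386.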

A followup CL will delete the old generator.

PCG is the only change here, so no benchmarks should be affected.
They are included anyway, as further evidence for caution.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 8993506f2f.amd64 │           01ff938549.amd64           │
                        │      sec/op      │    sec/op     vs base                │
SourceUint64-32                1.325n ± 1%    1.352n ± 1%   +2.00% (p=0.000 n=20)
GlobalInt64-32                 2.240n ± 1%    2.083n ± 0%   -7.03% (p=0.000 n=20)
GlobalInt64Parallel-32        0.1041n ± 1%   0.1035n ± 1%        ~ (p=0.064 n=20)
GlobalUint64-32                2.072n ± 3%    2.038n ± 1%        ~ (p=0.089 n=20)
GlobalUint64Parallel-32       0.1008n ± 1%   0.1006n ± 1%        ~ (p=0.804 n=20)
Int64-32                       1.716n ± 1%    1.687n ± 2%        ~ (p=0.045 n=20)
Uint64-32                      1.665n ± 1%    1.674n ± 2%        ~ (p=0.878 n=20)
GlobalIntN1000-32              3.335n ± 1%    3.135n ± 1%   -6.00% (p=0.000 n=20)
IntN1000-32                    2.484n ± 1%    2.478n ± 1%        ~ (p=0.085 n=20)
Int64N1000-32                  2.502n ± 2%    2.455n ± 1%   -1.88% (p=0.002 n=20)
Int64N1e8-32                   2.484n ± 2%    2.467n ± 2%        ~ (p=0.048 n=20)
Int64N1e9-32                   2.502n ± 0%    2.454n ± 1%   -1.92% (p=0.000 n=20)
Int64N2e9-32                   2.502n ± 0%    2.482n ± 1%   -0.76% (p=0.000 n=20)
Int64N1e18-32                  3.201n ± 1%    3.349n ± 2%   +4.62% (p=0.000 n=20)
Int64N2e18-32                  3.504n ± 1%    3.537n ± 1%        ~ (p=0.185 n=20)
Int64N4e18-32                  4.873n ± 1%    4.917n ± 0%   +0.90% (p=0.000 n=20)
Int32N1000-32                  2.639n ± 1%    2.386n ± 1%   -9.57% (p=0.000 n=20)
Int32N1e8-32                   2.686n ± 2%    2.366n ± 1%  -11.91% (p=0.000 n=20)
Int32N1e9-32                   2.636n ± 1%    2.355n ± 2%  -10.70% (p=0.000 n=20)
Int32N2e9-32                   2.660n ± 1%    2.371n ± 1%  -10.88% (p=0.000 n=20)
Float32-32                     2.261n ± 1%    2.245n ± 2%        ~ (p=0.752 n=20)
Float64-32                     2.280n ± 1%    2.235n ± 1%   -1.97% (p=0.007 n=20)
ExpFloat64-32                  3.891n ± 1%    3.813n ± 3%        ~ (p=0.087 n=20)
NormFloat64-32                 3.711n ± 1%    3.652n ± 2%        ~ (p=0.021 n=20)
Perm3-32                       32.60n ± 2%    33.12n ± 3%        ~ (p=0.107 n=20)
Perm30-32                      204.2n ± 0%    205.1n ± 1%        ~ (p=0.358 n=20)
Perm30ViaShuffle-32            121.7n ± 2%    110.8n ± 1%   -8.96% (p=0.000 n=20)
ShuffleOverhead-32             106.2n ± 2%    113.0n ± 1%   +6.36% (p=0.000 n=20)
Concurrent-32                  2.190n ± 5%    2.100n ± 0%   -4.13% (p=0.001 n=20)
PCG_DXSM-32                                   1.490n ± 0%

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ 8993506f2f.arm64 │           01ff938549.arm64           │
                       │      sec/op      │    sec/op     vs base                │
SourceUint64-8                2.271n ± 0%    2.258n ± 1%        ~ (p=0.167 n=20)
GlobalInt64-8                 2.161n ± 1%    2.167n ± 0%        ~ (p=0.693 n=20)
GlobalInt64Parallel-8        0.4303n ± 0%   0.4310n ± 0%        ~ (p=0.051 n=20)
GlobalUint64-8                2.164n ± 1%    2.182n ± 1%        ~ (p=0.042 n=20)
GlobalUint64Parallel-8       0.4287n ± 0%   0.4297n ± 0%        ~ (p=0.082 n=20)
Int64-8                       2.478n ± 1%    2.472n ± 1%        ~ (p=0.151 n=20)
Uint64-8                      2.460n ± 1%    2.449n ± 1%        ~ (p=0.013 n=20)
GlobalIntN1000-8              2.814n ± 2%    2.814n ± 2%        ~ (p=0.821 n=20)
IntN1000-8                    3.003n ± 2%    2.998n ± 2%        ~ (p=0.024 n=20)
Int64N1000-8                  2.954n ± 0%    2.949n ± 2%        ~ (p=0.192 n=20)
Int64N1e8-8                   2.956n ± 0%    2.953n ± 2%        ~ (p=0.109 n=20)
Int64N1e9-8                   3.325n ± 0%    2.950n ± 0%  -11.26% (p=0.000 n=20)
Int64N2e9-8                   2.956n ± 2%    2.946n ± 2%        ~ (p=0.027 n=20)
Int64N1e18-8                  3.780n ± 1%    3.779n ± 1%        ~ (p=0.815 n=20)
Int64N2e18-8                  4.385n ± 0%    4.370n ± 1%        ~ (p=0.402 n=20)
Int64N4e18-8                  6.527n ± 0%    6.544n ± 1%        ~ (p=0.140 n=20)
Int32N1000-8                  2.964n ± 1%    2.950n ± 0%   -0.47% (p=0.002 n=20)
Int32N1e8-8                   2.964n ± 1%    2.950n ± 2%        ~ (p=0.013 n=20)
Int32N1e9-8                   2.963n ± 2%    2.951n ± 2%        ~ (p=0.062 n=20)
Int32N2e9-8                   2.961n ± 2%    2.950n ± 2%   -0.37% (p=0.002 n=20)
Float32-8                     3.442n ± 0%    3.441n ± 0%        ~ (p=0.211 n=20)
Float64-8                     3.442n ± 0%    3.442n ± 0%        ~ (p=0.067 n=20)
ExpFloat64-8                  4.472n ± 0%    4.481n ± 0%   +0.20% (p=0.000 n=20)
NormFloat64-8                 4.734n ± 0%    4.725n ± 0%   -0.19% (p=0.003 n=20)
Perm3-8                       26.55n ± 0%    26.55n ± 0%        ~ (p=0.833 n=20)
Perm30-8                      181.9n ± 0%    181.9n ± 0%   -0.03% (p=0.004 n=20)
Perm30ViaShuffle-8            143.1n ± 0%    142.9n ± 0%        ~ (p=0.204 n=20)
ShuffleOverhead-8             120.6n ± 1%    120.8n ± 2%        ~ (p=0.102 n=20)
Concurrent-8                  2.357n ± 2%    2.421n ± 6%        ~ (p=0.016 n=20)
PCG_DXSM-8                                   2.531n ± 0%

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 8993506f2f.386 │           01ff938549.386            │
                        │     sec/op     │    sec/op     vs base               │
SourceUint64-32              2.102n ± 2%    2.069n ± 0%       ~ (p=0.021 n=20)
GlobalInt64-32               3.542n ± 2%    3.456n ± 1%  -2.44% (p=0.001 n=20)
GlobalInt64Parallel-32      0.3202n ± 0%   0.3252n ± 0%  +1.56% (p=0.000 n=20)
GlobalUint64-32              3.507n ± 1%    3.573n ± 1%  +1.87% (p=0.000 n=20)
GlobalUint64Parallel-32     0.3170n ± 1%   0.3159n ± 0%       ~ (p=0.167 n=20)
Int64-32                     2.516n ± 1%    2.562n ± 2%       ~ (p=0.016 n=20)
Uint64-32                    2.544n ± 1%    2.592n ± 0%  +1.85% (p=0.000 n=20)
GlobalIntN1000-32            6.237n ± 1%    6.266n ± 2%       ~ (p=0.268 n=20)
IntN1000-32                  4.670n ± 2%    4.724n ± 2%       ~ (p=0.644 n=20)
Int64N1000-32                5.412n ± 1%    5.490n ± 2%       ~ (p=0.159 n=20)
Int64N1e8-32                 5.414n ± 2%    5.513n ± 2%       ~ (p=0.129 n=20)
Int64N1e9-32                 5.473n ± 1%    5.476n ± 1%       ~ (p=0.723 n=20)
Int64N2e9-32                 5.487n ± 1%    5.501n ± 2%       ~ (p=0.481 n=20)
Int64N1e18-32                8.901n ± 2%    9.043n ± 2%       ~ (p=0.330 n=20)
Int64N2e18-32                9.521n ± 1%    9.601n ± 2%       ~ (p=0.703 n=20)
Int64N4e18-32                11.92n ± 1%    12.00n ± 1%       ~ (p=0.489 n=20)
Int32N1000-32                4.785n ± 1%    4.829n ± 2%       ~ (p=0.402 n=20)
Int32N1e8-32                 4.748n ± 1%    4.825n ± 2%       ~ (p=0.218 n=20)
Int32N1e9-32                 4.810n ± 1%    4.830n ± 2%       ~ (p=0.794 n=20)
Int32N2e9-32                 4.812n ± 1%    4.750n ± 2%       ~ (p=0.057 n=20)
Float32-32                   10.48n ± 4%    10.89n ± 4%       ~ (p=0.162 n=20)
Float64-32                   19.79n ± 3%    19.60n ± 4%       ~ (p=0.668 n=20)
ExpFloat64-32                12.91n ± 3%    12.96n ± 3%       ~ (p=1.000 n=20)
NormFloat64-32               7.462n ± 1%    7.516n ± 1%       ~ (p=0.051 n=20)
Perm3-32                     35.98n ± 2%    36.78n ± 2%       ~ (p=0.033 n=20)
Perm30-32                    241.5n ± 1%    238.9n ± 2%       ~ (p=0.126 n=20)
Perm30ViaShuffle-32          187.3n ± 2%    189.7n ± 2%       ~ (p=0.387 n=20)
ShuffleOverhead-32           160.2n ± 1%    159.8n ± 1%       ~ (p=0.256 n=20)
Concurrent-32                3.308n ± 3%    3.286n ± 1%       ~ (p=0.038 n=20)
PCG_DXSM-32                                 7.613n ± 1%

For #61716.

Change-Id: Icb274ca1f782504d658305a40159b4ae6a2f3f1d
Reviewed-on: https://go-review.googlesource.com/c/go/+/502505
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Rob Pike <r@golang.org>
2023-10-30 17:09:23 +00:00
Russ Cox c266587846 math/rand/v2: add, optimize N, UintN, Uint32N, Uint64N
Now that we can break the value stream, we can take advantage
of better algorithms that have been suggested since the original
code was written.

Also optimizes IntN, Int32N, Int64N, Perm (indirectly).

All the N variants (IntN, Int32N, Int64N, UintN, N, etc.) now
return the same values given a Source and parameter n, so that
for example uint(r.IntN(10)) and r.UintN(10) and r.N(uint(10))
are completely interchangeable.
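
One way to get this interchangeability is to route every N variant
through a single bounded-uint64 core. The sketch below does that with a
multiply-and-reject scheme (in the style of Lemire's method); treating
that as the algorithm used here is an assumption, and the source type,
its SplitMix64 body, and the helper names are all hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// source is a hypothetical 64-bit random source (SplitMix64, chosen
// only because it is short).
type source struct{ state uint64 }

func (s *source) Uint64() uint64 {
	s.state += 0x9e3779b97f4a7c15
	z := s.state
	z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9
	z = (z ^ (z >> 27)) * 0x94d049bb133111eb
	return z ^ (z >> 31)
}

// uint64n returns a uniform value in [0, n) for n > 0. It multiplies a
// random 64-bit value by n, keeps the high word, and rejects the few
// low-word values that would bias the result.
func uint64n(s *source, n uint64) uint64 {
	if n&(n-1) == 0 { // power of two: a mask is enough
		return s.Uint64() & (n - 1)
	}
	hi, lo := bits.Mul64(s.Uint64(), n)
	if lo < n {
		thresh := -n % n // equals 2^64 mod n
		for lo < thresh {
			hi, lo = bits.Mul64(s.Uint64(), n)
		}
	}
	return hi
}

// Every typed variant converts its argument and shares the same core,
// so equal n and equal source state give equal results.
func intN(s *source, n int) int       { return int(uint64n(s, uint64(n))) }
func uintN(s *source, n uint) uint    { return uint(uint64n(s, uint64(n))) }
func int64N(s *source, n int64) int64 { return int64(uint64n(s, uint64(n))) }

func main() {
	a, b, c := &source{state: 7}, &source{state: 7}, &source{state: 7}
	fmt.Println(intN(a, 10), uintN(b, 10), int64N(c, 10))
}
```

In this scheme the rejection loop runs more often as n approaches 2^63
from above, which is consistent with the large-n benchmarks being the
slow cases.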

Int64N4e18 gets slower but that is a near worst case for
the algorithm and is extremely unlikely in practice.

32-bit Int32N variants got slower too, by 15-30%, in exchange
for speeding up everything on 64-bit systems and consistency
across the N functions.

Also rename the previously missed benchmark
GlobalInt63Parallel to GlobalInt64Parallel.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 11ad9fdddc.amd64 │            4d84a369d1.amd64            │
                        │      sec/op      │    sec/op     vs base                  │
SourceUint64-32                1.335n ± 1%    1.348n ± 2%        ~ (p=0.335 n=20)
GlobalInt64-32                 2.046n ± 1%    2.082n ± 2%        ~ (p=0.310 n=20)
GlobalInt63Parallel-32        0.1037n ± 1%
GlobalInt64Parallel-32                       0.1036n ± 1%
GlobalUint64-32                2.075n ± 0%    2.077n ± 2%        ~ (p=0.228 n=20)
GlobalUint64Parallel-32       0.1013n ± 1%   0.1012n ± 1%        ~ (p=0.878 n=20)
Int64-32                       1.726n ± 2%    1.750n ± 0%   +1.39% (p=0.000 n=20)
Uint64-32                      1.673n ± 1%    1.707n ± 2%   +2.03% (p=0.002 n=20)
GlobalIntN1000-32              3.895n ± 2%    3.192n ± 1%  -18.05% (p=0.000 n=20)
IntN1000-32                    3.403n ± 1%    2.462n ± 2%  -27.65% (p=0.000 n=20)
Int64N1000-32                  3.053n ± 2%    2.470n ± 1%  -19.11% (p=0.000 n=20)
Int64N1e8-32                   2.718n ± 1%    2.503n ± 2%   -7.91% (p=0.000 n=20)
Int64N1e9-32                   2.712n ± 1%    2.487n ± 1%   -8.31% (p=0.000 n=20)
Int64N2e9-32                   2.690n ± 1%    2.487n ± 1%   -7.57% (p=0.000 n=20)
Int64N1e18-32                  3.084n ± 2%    3.006n ± 2%   -2.53% (p=0.000 n=20)
Int64N2e18-32                  4.026n ± 1%    3.368n ± 1%  -16.33% (p=0.000 n=20)
Int64N4e18-32                  4.049n ± 2%    4.763n ± 1%  +17.62% (p=0.000 n=20)
Int32N1000-32                  2.730n ± 0%    2.403n ± 1%  -11.94% (p=0.000 n=20)
Int32N1e8-32                   2.916n ± 2%    2.405n ± 1%  -17.53% (p=0.000 n=20)
Int32N1e9-32                   3.375n ± 1%    2.402n ± 2%  -28.83% (p=0.000 n=20)
Int32N2e9-32                   3.292n ± 1%    2.384n ± 1%  -27.58% (p=0.000 n=20)
Float32-32                     2.673n ± 1%    2.641n ± 2%        ~ (p=0.147 n=20)
Float64-32                     2.485n ± 1%    2.483n ± 1%        ~ (p=0.804 n=20)
ExpFloat64-32                  3.577n ± 2%    3.486n ± 2%   -2.57% (p=0.000 n=20)
NormFloat64-32                 3.797n ± 2%    3.648n ± 1%   -3.92% (p=0.000 n=20)
Perm3-32                       35.79n ± 2%    33.04n ± 1%   -7.68% (p=0.000 n=20)
Perm30-32                      205.1n ± 1%    171.9n ± 1%  -16.14% (p=0.000 n=20)
Perm30ViaShuffle-32            111.2n ± 2%    100.3n ± 1%   -9.76% (p=0.000 n=20)
ShuffleOverhead-32             100.5n ± 2%    102.5n ± 1%   +1.99% (p=0.007 n=20)
Concurrent-32                  2.188n ± 5%    2.101n ± 0%        ~ (p=0.013 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                       │ 11ad9fdddc.arm64 │            4d84a369d1.arm64            │
                       │      sec/op      │    sec/op     vs base                  │
SourceUint64-8                2.272n ± 1%    2.261n ± 1%        ~ (p=0.172 n=20)
GlobalInt64-8                 2.155n ± 1%    2.160n ± 1%        ~ (p=0.482 n=20)
GlobalInt63Parallel-8        0.4352n ± 0%
GlobalInt64Parallel-8                       0.4299n ± 0%
GlobalUint64-8                2.173n ± 1%    2.169n ± 1%        ~ (p=0.262 n=20)
GlobalUint64Parallel-8       0.4340n ± 0%   0.4293n ± 1%   -1.08% (p=0.000 n=20)
Int64-8                       2.544n ± 1%    2.473n ± 1%   -2.83% (p=0.000 n=20)
Uint64-8                      2.552n ± 1%    2.453n ± 1%   -3.90% (p=0.000 n=20)
GlobalIntN1000-8              3.856n ± 0%    2.814n ± 2%  -27.02% (p=0.000 n=20)
IntN1000-8                    3.820n ± 0%    2.933n ± 2%  -23.22% (p=0.000 n=20)
Int64N1000-8                  3.219n ± 2%    2.934n ± 2%   -8.85% (p=0.000 n=20)
Int64N1e8-8                   3.221n ± 2%    2.935n ± 2%   -8.91% (p=0.000 n=20)
Int64N1e9-8                   3.276n ± 2%    2.934n ± 2%  -10.44% (p=0.000 n=20)
Int64N2e9-8                   3.217n ± 0%    2.935n ± 2%   -8.78% (p=0.000 n=20)
Int64N1e18-8                  3.502n ± 2%    3.778n ± 1%   +7.91% (p=0.000 n=20)
Int64N2e18-8                  4.968n ± 1%    4.359n ± 1%  -12.26% (p=0.000 n=20)
Int64N4e18-8                  4.963n ± 0%    6.546n ± 1%  +31.92% (p=0.000 n=20)
Int32N1000-8                  3.189n ± 1%    2.940n ± 2%   -7.81% (p=0.000 n=20)
Int32N1e8-8                   3.514n ± 1%    2.937n ± 2%  -16.41% (p=0.000 n=20)
Int32N1e9-8                   4.133n ± 0%    2.938n ± 0%  -28.91% (p=0.000 n=20)
Int32N2e9-8                   4.137n ± 0%    2.938n ± 2%  -28.97% (p=0.000 n=20)
Float32-8                     3.468n ± 1%    3.486n ± 0%   +0.52% (p=0.000 n=20)
Float64-8                     3.478n ± 0%    3.480n ± 0%        ~ (p=0.063 n=20)
ExpFloat64-8                  4.563n ± 0%    4.533n ± 0%   -0.67% (p=0.000 n=20)
NormFloat64-8                 4.768n ± 0%    4.764n ± 0%   -0.07% (p=0.001 n=20)
Perm3-8                       28.94n ± 0%    26.66n ± 0%   -7.88% (p=0.000 n=20)
Perm30-8                      175.9n ± 0%    143.4n ± 0%  -18.50% (p=0.000 n=20)
Perm30ViaShuffle-8            152.6n ± 1%    142.9n ± 0%   -6.29% (p=0.000 n=20)
ShuffleOverhead-8             119.6n ± 1%    120.7n ± 0%   +0.96% (p=0.000 n=20)
Concurrent-8                  2.452n ± 3%    2.360n ± 2%   -3.73% (p=0.007 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 11ad9fdddc.386 │             4d84a369d1.386             │
                        │     sec/op     │    sec/op     vs base                  │
SourceUint64-32              2.091n ± 1%    2.101n ± 2%        ~ (p=0.672 n=20)
GlobalInt64-32               3.514n ± 2%    3.518n ± 2%        ~ (p=0.723 n=20)
GlobalInt63Parallel-32      0.3197n ± 0%
GlobalInt64Parallel-32                     0.3206n ± 0%
GlobalUint64-32              3.542n ± 1%    3.538n ± 1%        ~ (p=0.304 n=20)
GlobalUint64Parallel-32     0.3218n ± 0%   0.3231n ± 0%        ~ (p=0.071 n=20)
Int64-32                     2.552n ± 2%    2.554n ± 2%        ~ (p=0.693 n=20)
Uint64-32                    2.566n ± 1%    2.575n ± 2%        ~ (p=0.606 n=20)
GlobalIntN1000-32            5.965n ± 2%    6.292n ± 1%   +5.46% (p=0.000 n=20)
IntN1000-32                  4.652n ± 1%    4.735n ± 1%   +1.77% (p=0.000 n=20)
Int64N1000-32               14.485n ± 1%    5.489n ± 2%  -62.11% (p=0.000 n=20)
Int64N1e8-32                14.675n ± 1%    5.528n ± 2%  -62.33% (p=0.000 n=20)
Int64N1e9-32                16.805n ± 2%    5.438n ± 2%  -67.64% (p=0.000 n=20)
Int64N2e9-32                14.515n ± 1%    5.474n ± 1%  -62.28% (p=0.000 n=20)
Int64N1e18-32               16.165n ± 1%    9.053n ± 1%  -44.00% (p=0.000 n=20)
Int64N2e18-32               17.945n ± 2%    9.685n ± 2%  -46.03% (p=0.000 n=20)
Int64N4e18-32                18.35n ± 2%    12.18n ± 1%  -33.62% (p=0.000 n=20)
Int32N1000-32                3.608n ± 1%    4.862n ± 1%  +34.77% (p=0.000 n=20)
Int32N1e8-32                 3.767n ± 1%    4.758n ± 2%  +26.31% (p=0.000 n=20)
Int32N1e9-32                 4.130n ± 2%    4.772n ± 1%  +15.54% (p=0.000 n=20)
Int32N2e9-32                 4.206n ± 1%    4.847n ± 0%  +15.24% (p=0.000 n=20)
Float32-32                   22.18n ± 4%    22.18n ± 4%        ~ (p=0.195 n=20)
Float64-32                   20.75n ± 4%    21.21n ± 3%        ~ (p=0.394 n=20)
ExpFloat64-32                12.58n ± 3%    12.39n ± 2%        ~ (p=0.032 n=20)
NormFloat64-32               7.920n ± 3%    7.422n ± 1%   -6.29% (p=0.000 n=20)
Perm3-32                     40.27n ± 1%    38.00n ± 2%   -5.65% (p=0.000 n=20)
Perm30-32                    213.2n ± 2%    212.7n ± 1%        ~ (p=0.995 n=20)
Perm30ViaShuffle-32          164.2n ± 2%    187.5n ± 2%  +14.22% (p=0.000 n=20)
ShuffleOverhead-32           134.7n ± 2%    159.7n ± 1%  +18.52% (p=0.000 n=20)
Concurrent-32                3.301n ± 2%    3.470n ± 0%   +5.10% (p=0.000 n=20)

For #61716.

Change-Id: Id1481b04202883cd0b23e21bb58d1bca4e482bd3
Reviewed-on: https://go-review.googlesource.com/c/go/+/502500
Reviewed-by: Rob Pike <r@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-30 17:08:37 +00:00
Russ Cox c7dddb02d3 math/rand/v2: change Source to use uint64
This should make Uint64-using functions faster and leave
other things alone. It is a mystery why so much else got faster.
That is a good cautionary tale about not reading too much into
minor jitter in the benchmarks.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.amd64 │           11ad9fdddc.amd64           │
                        │      sec/op      │    sec/op     vs base                │
SourceUint64-32                1.555n ± 1%    1.335n ± 1%  -14.15% (p=0.000 n=20)
GlobalInt64-32                 2.071n ± 1%    2.046n ± 1%        ~ (p=0.016 n=20)
GlobalInt63Parallel-32        0.1023n ± 1%   0.1037n ± 1%   +1.37% (p=0.002 n=20)
GlobalUint64-32                5.193n ± 1%    2.075n ± 0%  -60.06% (p=0.000 n=20)
GlobalUint64Parallel-32       0.2341n ± 0%   0.1013n ± 1%  -56.74% (p=0.000 n=20)
Int64-32                       2.056n ± 2%    1.726n ± 2%  -16.10% (p=0.000 n=20)
Uint64-32                      2.077n ± 2%    1.673n ± 1%  -19.46% (p=0.000 n=20)
GlobalIntN1000-32              4.077n ± 2%    3.895n ± 2%   -4.45% (p=0.000 n=20)
IntN1000-32                    3.476n ± 2%    3.403n ± 1%   -2.10% (p=0.000 n=20)
Int64N1000-32                  3.059n ± 1%    3.053n ± 2%        ~ (p=0.131 n=20)
Int64N1e8-32                   2.942n ± 1%    2.718n ± 1%   -7.60% (p=0.000 n=20)
Int64N1e9-32                   2.932n ± 1%    2.712n ± 1%   -7.50% (p=0.000 n=20)
Int64N2e9-32                   2.925n ± 1%    2.690n ± 1%   -8.03% (p=0.000 n=20)
Int64N1e18-32                  3.116n ± 1%    3.084n ± 2%        ~ (p=0.425 n=20)
Int64N2e18-32                  4.067n ± 1%    4.026n ± 1%   -1.02% (p=0.007 n=20)
Int64N4e18-32                  4.054n ± 1%    4.049n ± 2%        ~ (p=0.204 n=20)
Int32N1000-32                  2.951n ± 1%    2.730n ± 0%   -7.49% (p=0.000 n=20)
Int32N1e8-32                   3.102n ± 1%    2.916n ± 2%   -6.03% (p=0.000 n=20)
Int32N1e9-32                   3.535n ± 1%    3.375n ± 1%   -4.54% (p=0.000 n=20)
Int32N2e9-32                   3.514n ± 1%    3.292n ± 1%   -6.30% (p=0.000 n=20)
Float32-32                     2.760n ± 1%    2.673n ± 1%   -3.13% (p=0.000 n=20)
Float64-32                     2.284n ± 1%    2.485n ± 1%   +8.80% (p=0.000 n=20)
ExpFloat64-32                  3.757n ± 1%    3.577n ± 2%   -4.78% (p=0.000 n=20)
NormFloat64-32                 3.837n ± 1%    3.797n ± 2%        ~ (p=0.204 n=20)
Perm3-32                       35.23n ± 2%    35.79n ± 2%        ~ (p=0.298 n=20)
Perm30-32                      208.8n ± 1%    205.1n ± 1%   -1.82% (p=0.000 n=20)
Perm30ViaShuffle-32            111.7n ± 1%    111.2n ± 2%        ~ (p=0.273 n=20)
ShuffleOverhead-32             101.1n ± 1%    100.5n ± 2%        ~ (p=0.878 n=20)
Concurrent-32                  2.108n ± 7%    2.188n ± 5%        ~ (p=0.417 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
                       │ 220860f76f.arm64 │           11ad9fdddc.arm64           │
                       │      sec/op      │    sec/op     vs base                │
SourceUint64-8                2.316n ± 1%    2.272n ± 1%   -1.86% (p=0.000 n=20)
GlobalInt64-8                 2.183n ± 1%    2.155n ± 1%        ~ (p=0.122 n=20)
GlobalInt63Parallel-8        0.4331n ± 0%   0.4352n ± 0%   +0.48% (p=0.000 n=20)
GlobalUint64-8                4.377n ± 2%    2.173n ± 1%  -50.35% (p=0.000 n=20)
GlobalUint64Parallel-8       0.9237n ± 0%   0.4340n ± 0%  -53.02% (p=0.000 n=20)
Int64-8                       2.538n ± 1%    2.544n ± 1%        ~ (p=0.189 n=20)
Uint64-8                      2.604n ± 1%    2.552n ± 1%   -1.98% (p=0.000 n=20)
GlobalIntN1000-8              3.857n ± 2%    3.856n ± 0%        ~ (p=0.051 n=20)
IntN1000-8                    3.822n ± 2%    3.820n ± 0%   -0.05% (p=0.001 n=20)
Int64N1000-8                  3.318n ± 0%    3.219n ± 2%   -2.98% (p=0.000 n=20)
Int64N1e8-8                   3.349n ± 1%    3.221n ± 2%   -3.79% (p=0.000 n=20)
Int64N1e9-8                   3.317n ± 2%    3.276n ± 2%   -1.24% (p=0.001 n=20)
Int64N2e9-8                   3.317n ± 2%    3.217n ± 0%   -3.01% (p=0.000 n=20)
Int64N1e18-8                  3.542n ± 1%    3.502n ± 2%   -1.16% (p=0.001 n=20)
Int64N2e18-8                  5.087n ± 0%    4.968n ± 1%   -2.33% (p=0.000 n=20)
Int64N4e18-8                  5.084n ± 0%    4.963n ± 0%   -2.39% (p=0.000 n=20)
Int32N1000-8                  3.208n ± 2%    3.189n ± 1%   -0.58% (p=0.001 n=20)
Int32N1e8-8                   3.610n ± 1%    3.514n ± 1%   -2.67% (p=0.000 n=20)
Int32N1e9-8                   4.235n ± 0%    4.133n ± 0%   -2.40% (p=0.000 n=20)
Int32N2e9-8                   4.229n ± 1%    4.137n ± 0%   -2.19% (p=0.000 n=20)
Float32-8                     3.468n ± 0%    3.468n ± 1%        ~ (p=0.350 n=20)
Float64-8                     3.447n ± 0%    3.478n ± 0%   +0.90% (p=0.000 n=20)
ExpFloat64-8                  4.567n ± 0%    4.563n ± 0%   -0.10% (p=0.002 n=20)
NormFloat64-8                 4.821n ± 0%    4.768n ± 0%   -1.09% (p=0.000 n=20)
Perm3-8                       28.89n ± 0%    28.94n ± 0%   +0.17% (p=0.000 n=20)
Perm30-8                      175.7n ± 0%    175.9n ± 0%   +0.14% (p=0.000 n=20)
Perm30ViaShuffle-8            153.5n ± 0%    152.6n ± 1%        ~ (p=0.010 n=20)
ShuffleOverhead-8             119.8n ± 1%    119.6n ± 1%        ~ (p=0.147 n=20)
Concurrent-8                  2.433n ± 3%    2.452n ± 3%        ~ (p=0.616 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.386 │            11ad9fdddc.386            │
                        │     sec/op     │    sec/op     vs base                │
SourceUint64-32             2.370n ±  1%    2.091n ± 1%  -11.75% (p=0.000 n=20)
GlobalInt64-32              3.569n ±  1%    3.514n ± 2%   -1.56% (p=0.000 n=20)
GlobalInt63Parallel-32     0.3221n ±  1%   0.3197n ± 0%   -0.76% (p=0.000 n=20)
GlobalUint64-32             8.797n ± 10%    3.542n ± 1%  -59.74% (p=0.000 n=20)
GlobalUint64Parallel-32    0.6351n ±  0%   0.3218n ± 0%  -49.33% (p=0.000 n=20)
Int64-32                    2.612n ±  2%    2.552n ± 2%   -2.30% (p=0.000 n=20)
Uint64-32                   3.350n ±  1%    2.566n ± 1%  -23.42% (p=0.000 n=20)
GlobalIntN1000-32           5.892n ±  1%    5.965n ± 2%        ~ (p=0.082 n=20)
IntN1000-32                 4.546n ±  1%    4.652n ± 1%   +2.33% (p=0.000 n=20)
Int64N1000-32               14.59n ±  1%    14.48n ± 1%        ~ (p=0.652 n=20)
Int64N1e8-32                14.76n ±  2%    14.67n ± 1%        ~ (p=0.836 n=20)
Int64N1e9-32                16.57n ±  1%    16.80n ± 2%        ~ (p=0.016 n=20)
Int64N2e9-32                14.54n ±  1%    14.52n ± 1%        ~ (p=0.533 n=20)
Int64N1e18-32               16.14n ±  1%    16.16n ± 1%        ~ (p=0.606 n=20)
Int64N2e18-32               18.10n ±  1%    17.95n ± 2%        ~ (p=0.062 n=20)
Int64N4e18-32               18.65n ±  1%    18.35n ± 2%   -1.61% (p=0.010 n=20)
Int32N1000-32               3.560n ±  1%    3.608n ± 1%   +1.33% (p=0.001 n=20)
Int32N1e8-32                3.770n ±  2%    3.767n ± 1%        ~ (p=0.155 n=20)
Int32N1e9-32                4.098n ±  0%    4.130n ± 2%        ~ (p=0.016 n=20)
Int32N2e9-32                4.179n ±  1%    4.206n ± 1%        ~ (p=0.011 n=20)
Float32-32                  21.18n ±  4%    22.18n ± 4%   +4.70% (p=0.003 n=20)
Float64-32                  20.60n ±  2%    20.75n ± 4%   +0.73% (p=0.000 n=20)
ExpFloat64-32               13.07n ±  0%    12.58n ± 3%   -3.82% (p=0.000 n=20)
NormFloat64-32              7.738n ±  2%    7.920n ± 3%        ~ (p=0.066 n=20)
Perm3-32                    36.73n ±  1%    40.27n ± 1%   +9.65% (p=0.000 n=20)
Perm30-32                   211.9n ±  1%    213.2n ± 2%        ~ (p=0.262 n=20)
Perm30ViaShuffle-32         165.2n ±  1%    164.2n ± 2%        ~ (p=0.029 n=20)
ShuffleOverhead-32          133.9n ±  1%    134.7n ± 2%        ~ (p=0.551 n=20)
Concurrent-32               3.287n ±  2%    3.301n ± 2%        ~ (p=0.330 n=20)

For #61716.

Change-Id: I8d2f73f87dd3603a0c2ff069988938e0957b6904
Reviewed-on: https://go-review.googlesource.com/c/go/+/502499
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
2023-10-30 17:08:34 +00:00
Russ Cox 1cc5b34d28 math/rand/v2: remove Rand.Seed
Removing Rand.Seed lets us remove lockedSource as well,
along with the ambiguity in globalRand about which source
to use.
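
For readers wondering how to get a reproducible sequence without Rand.Seed, one possible approach is to construct a generator from an explicitly seeded Source. The sketch below assumes the API as released in Go 1.22, where rand.NewPCG was added by a later CL in this series; the helper name sameSequence is hypothetical.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// sameSequence is a hypothetical helper: it builds two generators from
// the same PCG seed pair and reports whether their first n outputs agree.
func sameSequence(seed1, seed2 uint64, n int) bool {
	a := rand.New(rand.NewPCG(seed1, seed2))
	b := rand.New(rand.NewPCG(seed1, seed2))
	for i := 0; i < n; i++ {
		if a.Uint64() != b.Uint64() {
			return false
		}
	}
	return true
}

func main() {
	// Seeding happens at construction time; there is no later Seed call.
	fmt.Println(sameSequence(1, 2, 100))
}
```

Because the seed is fixed when the Source is created, a Rand never needs to swap state underneath concurrent callers, which is the job lockedSource used to do.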

For #61716.

Change-Id: Ibe150520dd1e7dd87165eacaebe9f0c2daeaedfd
Reviewed-on: https://go-review.googlesource.com/c/go/+/502498
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
2023-10-30 14:31:46 +00:00
Russ Cox d6c1ef52ad math/rand/v2: remove Read
In math/rand, Read is deprecated. Remove it in v2.
People should use crypto/rand if they need long strings of random bytes.

For #61716.

Change-Id: Ib254b7e1844616e96db60a3a7abb572b0dcb1583
Reviewed-on: https://go-review.googlesource.com/c/go/+/502497
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-30 14:30:14 +00:00
Russ Cox d42750b17c math/rand/v2: rename various functions
Int31 -> Int32
Int31n -> Int32N
Int63 -> Int64
Int63n -> Int64N
Intn -> IntN

The 31 and 63 are pedantic and confusing: the functions should
be named for the type they return, same as all the others.

The lower-case n is inconsistent with Go's usual CamelCase
and especially problematic because we plan to add 'func N'.
Capitalize the n.
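
To show the renamed functions side by side, here is a short sketch written against the released Go 1.22 math/rand/v2 API. The helper name inRange is hypothetical; it only checks the documented ranges of each call.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// inRange is a hypothetical helper: it calls each renamed top-level
// function once and reports whether every result lies in its range.
func inRange() bool {
	a := rand.Int32()      // was Int31: non-negative int32
	b := rand.Int32N(1000) // was Int31n: int32 in [0, 1000)
	c := rand.Int64()      // was Int63: non-negative int64
	d := rand.Int64N(1e9)  // was Int63n: int64 in [0, 1e9)
	e := rand.IntN(10)     // was Intn: int in [0, 10)
	return a >= 0 &&
		b >= 0 && b < 1000 &&
		c >= 0 &&
		d >= 0 && d < 1e9 &&
		e >= 0 && e < 10
}

func main() {
	fmt.Println(inRange())
}
```

The values still carry 31 and 63 random bits; only the names changed, so that each function is named for the type it returns.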

For #61716.

Change-Id: Idb1a005a82f353677450d47fb612ade7a41fde69
Reviewed-on: https://go-review.googlesource.com/c/go/+/516857
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-30 14:29:37 +00:00
Russ Cox 59f0ab4036 math/rand/v2: start of new API
This is the beginning of the math/rand/v2 package from proposal #61716.
Start by copying the old API. This CL copies math/rand/* to math/rand/v2
and updates references to math/rand to add v2 throughout.
Later CLs will make the v2 changes.

For #61716.

Change-Id: I1624ccffae3dfa442d4ba2461942decbd076e11b
Reviewed-on: https://go-review.googlesource.com/c/go/+/502495
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
2023-10-30 14:29:30 +00:00