mirror of https://github.com/golang/go.git
444 Commits
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
11ce6eabd6 |
math/bits: remove named return in TrailingZeros16
TrailingZeros16 is the only one of the TrailingZeros functions with a named return value in the signature. This creates a slightly unpleasant effect in the godoc listing:
func TrailingZeros(x uint) int
func TrailingZeros16(x uint16) (n int)
func TrailingZeros32(x uint32) int
func TrailingZeros64(x uint64) int
func TrailingZeros8(x uint8) int
Since the named return value is not even used, remove it.
Change-Id: I15c5aedb6157003911b6e0685c357ce56e466c0e
Reviewed-on: https://go-review.googlesource.com/c/153340
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
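The point of the commit above is that a named result which is never used changes only how the signature reads, not what the function does. A minimal sketch of that, assuming two hypothetical wrappers (`tzNamed` and `tzPlain` are illustrative names, not part of math/bits):

```go
package main

import (
	"fmt"
	"math/bits"
)

// tzNamed is a hypothetical wrapper written in the old style:
// the result is named n, but the name is never used in the body.
func tzNamed(x uint16) (n int) {
	return bits.TrailingZeros16(x)
}

// tzPlain is the same hypothetical wrapper with an unnamed result,
// matching the style of the other TrailingZeros functions.
func tzPlain(x uint16) int {
	return bits.TrailingZeros16(x)
}

func main() {
	// Both wrappers return identical values; only the documented
	// signature differs.
	for _, x := range []uint16{0, 1, 8, 0x8000} {
		fmt.Println(x, tzNamed(x), tzPlain(x))
	}
}
```

Callers are unaffected either way, because result names are not part of a function's type in Go.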
|
|
|
276870d6e0 |
math: document sign bit correspondence for floating-point/bits conversions
Fixes #27736. Change-Id: Ibda7da7ec6e731626fc43abf3e8c1190117f7885 Reviewed-on: https://go-review.googlesource.com/c/153057 Reviewed-by: Ian Lance Taylor <iant@golang.org> |
|
|
|
bfc54bb6f3 |
math/big: allocate less for single-Word nats
For many uses of math/big, most numbers are small in practice. Prior to this change, big.NewInt allocated a minimum of five Words: one to hold the value, and four as extra capacity. In most cases, this extra capacity is waste. Worse, allocating a single Word uses a fast malloc path for tiny allocs; allocating five Words is more expensive in CPU as well as memory.

This change is a simple fix: Treat a request for one Word at its word. I experimented with more complicated fixes and did not find anything that outperformed this easy fix.

On some real world programs, this is a clear win.

The compiler:

name        old alloc/op  new alloc/op  delta
Template    37.1MB ± 0%   37.0MB ± 0%   -0.23%  (p=0.008 n=5+5)
Unicode     29.2MB ± 0%   28.5MB ± 0%   -2.48%  (p=0.008 n=5+5)
GoTypes     133MB ± 0%    133MB ± 0%    -0.05%  (p=0.008 n=5+5)
Compiler    628MB ± 0%    628MB ± 0%    -0.06%  (p=0.008 n=5+5)
SSA         2.04GB ± 0%   2.03GB ± 0%   -0.14%  (p=0.008 n=5+5)
Flate       24.7MB ± 0%   24.6MB ± 0%   -0.23%  (p=0.008 n=5+5)
GoParser    29.6MB ± 0%   29.6MB ± 0%   -0.07%  (p=0.008 n=5+5)
Reflect     82.3MB ± 0%   82.2MB ± 0%   -0.05%  (p=0.008 n=5+5)
Tar         36.2MB ± 0%   36.2MB ± 0%   -0.12%  (p=0.008 n=5+5)
XML         49.5MB ± 0%   49.4MB ± 0%   -0.23%  (p=0.008 n=5+5)
[Geo mean]  85.1MB        84.8MB        -0.37%

name        old allocs/op  new allocs/op  delta
Template    364k ± 0%      364k ± 0%      ~       (p=0.476 n=5+5)
Unicode     341k ± 0%      341k ± 0%      ~       (p=0.690 n=5+5)
GoTypes     1.37M ± 0%     1.37M ± 0%     ~       (p=0.444 n=5+5)
Compiler    5.50M ± 0%     5.50M ± 0%     +0.02%  (p=0.008 n=5+5)
SSA         16.0M ± 0%     16.0M ± 0%     +0.01%  (p=0.008 n=5+5)
Flate       238k ± 0%      238k ± 0%      ~       (p=0.222 n=5+5)
GoParser    305k ± 0%      305k ± 0%      ~       (p=0.841 n=5+5)
Reflect     976k ± 0%      976k ± 0%      ~       (p=0.222 n=5+5)
Tar         354k ± 0%      354k ± 0%      ~       (p=0.103 n=5+5)
XML         450k ± 0%      450k ± 0%      ~       (p=0.151 n=5+5)
[Geo mean]  837k           837k           +0.01%

go.skylark.net (at ea6d2813de75ded8d157b9540bc3d3ad0b688623):

name                     old alloc/op  new alloc/op  delta
Hashtable-8              456kB ± 0%    299kB ± 0%    -34.33%  (p=0.000 n=9+9)
/bench_builtin_method-8  220kB ± 0%    190kB ± 0%    -13.55%  (p=0.000 n=9+10)

name old allocs/op new allocs/op
delta Hashtable-8 7.84k ± 0% 7.84k ± 0% ~ (all equal) /bench_builtin_method-8 7.49k ± 0% 7.49k ± 0% ~ (all equal) The math/big benchmarks are messy, which is predictable, since they naturally exercise the bigger-than-one-word code more. Also worth noting is that many of the benchmarks have very high variance. I've omitted the opVV and opVW benchmarks, as they are unrelated. name old time/op new time/op delta DecimalConversion-8 92.5µs ± 1% 90.6µs ± 0% -2.12% (p=0.000 n=17+19) FloatString/100-8 867ns ± 0% 871ns ± 0% +0.50% (p=0.000 n=18+18) FloatString/1000-8 26.4µs ± 1% 26.5µs ± 1% ~ (p=0.396 n=20+19) FloatString/10000-8 2.15ms ± 2% 2.16ms ± 2% ~ (p=0.089 n=19+20) FloatString/100000-8 209ms ± 1% 209ms ± 1% ~ (p=0.583 n=19+19) FloatAdd/10-8 63.5ns ± 2% 64.1ns ± 6% ~ (p=0.389 n=19+19) FloatAdd/100-8 66.0ns ± 2% 65.8ns ± 2% ~ (p=0.825 n=20+20) FloatAdd/1000-8 93.9ns ± 1% 94.3ns ± 1% ~ (p=0.273 n=19+20) FloatAdd/10000-8 347ns ± 2% 342ns ± 1% -1.50% (p=0.000 n=18+18) FloatAdd/100000-8 2.78µs ± 1% 2.78µs ± 2% ~ (p=0.961 n=20+19) FloatSub/10-8 56.9ns ± 2% 57.8ns ± 3% +1.59% (p=0.001 n=19+19) FloatSub/100-8 58.2ns ± 2% 58.9ns ± 2% +1.25% (p=0.004 n=20+20) FloatSub/1000-8 74.9ns ± 1% 74.4ns ± 1% -0.76% (p=0.000 n=19+20) FloatSub/10000-8 223ns ± 1% 220ns ± 2% -1.29% (p=0.000 n=16+20) FloatSub/100000-8 1.66µs ± 1% 1.66µs ± 2% ~ (p=0.147 n=20+20) ParseFloatSmallExp-8 8.38µs ± 0% 8.59µs ± 0% +2.48% (p=0.000 n=19+19) ParseFloatLargeExp-8 31.1µs ± 0% 32.0µs ± 0% +3.04% (p=0.000 n=16+17) GCD10x10/WithoutXY-8 115ns ± 1% 99ns ± 3% -14.07% (p=0.000 n=20+20) GCD10x10/WithXY-8 322ns ± 0% 312ns ± 0% -3.11% (p=0.000 n=18+13) GCD10x100/WithoutXY-8 233ns ± 1% 219ns ± 1% -5.73% (p=0.000 n=19+17) GCD10x100/WithXY-8 709ns ± 0% 759ns ± 0% +7.04% (p=0.000 n=19+19) GCD10x1000/WithoutXY-8 653ns ± 1% 642ns ± 1% -1.69% (p=0.000 n=17+20) GCD10x1000/WithXY-8 1.35µs ± 0% 1.35µs ± 1% ~ (p=0.255 n=20+16) GCD10x10000/WithoutXY-8 4.57µs ± 1% 4.61µs ± 1% +0.95% (p=0.000 n=18+17) GCD10x10000/WithXY-8 6.82µs 
± 0% 6.84µs ± 0% +0.27% (p=0.000 n=16+17) GCD10x100000/WithoutXY-8 43.9µs ± 1% 44.0µs ± 1% +0.28% (p=0.000 n=18+17) GCD10x100000/WithXY-8 60.6µs ± 0% 60.6µs ± 0% ~ (p=0.907 n=18+18) GCD100x100/WithoutXY-8 1.13µs ± 0% 1.21µs ± 0% +6.39% (p=0.000 n=19+19) GCD100x100/WithXY-8 1.82µs ± 0% 1.92µs ± 0% +5.24% (p=0.000 n=19+17) GCD100x1000/WithoutXY-8 2.00µs ± 0% 2.03µs ± 1% +1.61% (p=0.000 n=18+16) GCD100x1000/WithXY-8 3.22µs ± 0% 3.20µs ± 1% -0.83% (p=0.000 n=19+19) GCD100x10000/WithoutXY-8 9.28µs ± 1% 9.17µs ± 1% -1.25% (p=0.000 n=18+19) GCD100x10000/WithXY-8 13.5µs ± 0% 13.3µs ± 0% -1.12% (p=0.000 n=18+19) GCD100x100000/WithoutXY-8 80.4µs ± 0% 78.6µs ± 0% -2.25% (p=0.000 n=19+19) GCD100x100000/WithXY-8 114µs ± 0% 112µs ± 0% -1.46% (p=0.000 n=19+17) GCD1000x1000/WithoutXY-8 12.9µs ± 1% 12.9µs ± 2% -0.50% (p=0.014 n=20+19) GCD1000x1000/WithXY-8 19.6µs ± 1% 19.6µs ± 2% -0.28% (p=0.040 n=17+18) GCD1000x10000/WithoutXY-8 22.4µs ± 0% 22.4µs ± 2% ~ (p=0.220 n=19+19) GCD1000x10000/WithXY-8 57.0µs ± 0% 56.5µs ± 0% -0.87% (p=0.000 n=20+20) GCD1000x100000/WithoutXY-8 116µs ± 0% 115µs ± 0% -0.49% (p=0.000 n=18+19) GCD1000x100000/WithXY-8 410µs ± 0% 411µs ± 0% ~ (p=0.052 n=19+19) GCD10000x10000/WithoutXY-8 247µs ± 1% 244µs ± 1% -0.92% (p=0.000 n=19+19) GCD10000x10000/WithXY-8 476µs ± 1% 473µs ± 1% -0.48% (p=0.009 n=19+19) GCD10000x100000/WithoutXY-8 573µs ± 1% 571µs ± 1% -0.45% (p=0.012 n=20+20) GCD10000x100000/WithXY-8 3.35ms ± 1% 3.35ms ± 1% ~ (p=0.444 n=20+19) GCD100000x100000/WithoutXY-8 12.0ms ± 2% 11.9ms ± 2% ~ (p=0.276 n=18+20) GCD100000x100000/WithXY-8 27.3ms ± 1% 27.3ms ± 1% ~ (p=0.792 n=20+19) Hilbert-8 672µs ± 0% 611µs ± 0% -9.02% (p=0.000 n=19+19) Binomial-8 1.40µs ± 0% 1.18µs ± 0% -15.69% (p=0.000 n=16+14) QuoRem-8 2.20µs ± 1% 2.17µs ± 1% -1.13% (p=0.000 n=19+19) Exp-8 4.10ms ± 1% 4.11ms ± 1% ~ (p=0.296 n=20+19) Exp2-8 4.11ms ± 1% 4.12ms ± 1% ~ (p=0.429 n=20+20) Bitset-8 8.67ns ± 6% 8.74ns ± 4% ~ (p=0.139 n=19+17) BitsetNeg-8 43.6ns ± 1% 43.8ns ± 2% +0.61% (p=0.036 
n=20+20) BitsetOrig-8 77.5ns ± 1% 68.4ns ± 1% -11.77% (p=0.000 n=19+20) BitsetNegOrig-8 145ns ± 1% 141ns ± 1% -2.87% (p=0.000 n=19+20) ModSqrt225_Tonelli-8 324µs ± 1% 324µs ± 1% ~ (p=0.409 n=18+20) ModSqrt225_3Mod4-8 98.9µs ± 1% 99.1µs ± 1% ~ (p=0.298 n=19+18) ModSqrt231_Tonelli-8 337µs ± 1% 337µs ± 1% ~ (p=0.718 n=20+18) ModSqrt231_5Mod8-8 115µs ± 1% 114µs ± 1% -0.22% (p=0.050 n=20+20) ModInverse-8 895ns ± 0% 869ns ± 1% -2.83% (p=0.000 n=17+17) Sqrt-8 28.1µs ± 1% 28.1µs ± 0% -0.28% (p=0.000 n=16+20) IntSqr/1-8 10.8ns ± 3% 10.5ns ± 3% -2.51% (p=0.000 n=19+17) IntSqr/2-8 30.5ns ± 2% 30.3ns ± 4% -0.71% (p=0.035 n=18+18) IntSqr/3-8 40.1ns ± 1% 40.1ns ± 1% ~ (p=0.710 n=20+17) IntSqr/5-8 65.3ns ± 1% 65.4ns ± 2% ~ (p=0.744 n=19+19) IntSqr/8-8 101ns ± 1% 102ns ± 0% ~ (p=0.234 n=19+20) IntSqr/10-8 138ns ± 0% 138ns ± 2% ~ (p=0.827 n=18+18) IntSqr/20-8 378ns ± 1% 378ns ± 1% ~ (p=0.479 n=18+18) IntSqr/30-8 637ns ± 0% 638ns ± 1% ~ (p=0.051 n=18+20) IntSqr/50-8 1.34µs ± 2% 1.34µs ± 1% ~ (p=0.970 n=18+19) IntSqr/80-8 2.78µs ± 0% 2.78µs ± 1% -0.18% (p=0.006 n=19+17) IntSqr/100-8 3.98µs ± 0% 3.98µs ± 0% ~ (p=0.057 n=17+19) IntSqr/200-8 13.5µs ± 0% 13.5µs ± 1% -0.33% (p=0.000 n=19+17) IntSqr/300-8 25.3µs ± 1% 25.3µs ± 1% ~ (p=0.361 n=19+20) IntSqr/500-8 62.9µs ± 0% 62.9µs ± 1% ~ (p=0.899 n=17+17) IntSqr/800-8 128µs ± 1% 127µs ± 1% -0.32% (p=0.016 n=18+20) IntSqr/1000-8 192µs ± 0% 192µs ± 1% ~ (p=0.916 n=17+18) Div/20/10-8 34.9ns ± 2% 35.6ns ± 1% +2.01% (p=0.000 n=20+20) Div/200/100-8 218ns ± 1% 215ns ± 2% -1.43% (p=0.000 n=18+18) Div/2000/1000-8 1.16µs ± 1% 1.15µs ± 1% -1.04% (p=0.000 n=19+20) Div/20000/10000-8 35.7µs ± 1% 35.4µs ± 1% -0.69% (p=0.000 n=19+18) Div/200000/100000-8 2.89ms ± 1% 2.88ms ± 1% -0.62% (p=0.007 n=19+20) Mul-8 9.28ms ± 1% 9.27ms ± 1% ~ (p=0.563 n=18+18) ZeroShifts/Shl-8 712ns ± 6% 716ns ± 7% ~ (p=0.597 n=20+20) ZeroShifts/ShlSame-8 4.00ns ± 1% 4.06ns ± 5% ~ (p=0.162 n=18+20) ZeroShifts/Shr-8 714ns ±10% 1285ns ±156% ~ (p=0.250 n=20+20) ZeroShifts/ShrSame-8 
4.00ns ± 1% 4.09ns ±10% +2.34% (p=0.048 n=16+19) Exp3Power/0x10-8 154ns ± 0% 159ns ±13% ~ (p=0.197 n=14+20) Exp3Power/0x40-8 171ns ± 1% 175ns ± 8% ~ (p=0.058 n=16+19) Exp3Power/0x100-8 287ns ± 0% 316ns ± 4% +10.03% (p=0.000 n=17+19) Exp3Power/0x400-8 698ns ± 1% 801ns ± 6% +14.75% (p=0.000 n=19+20) Exp3Power/0x1000-8 2.87µs ± 0% 3.65µs ± 6% +27.24% (p=0.000 n=18+18) Exp3Power/0x4000-8 21.9µs ± 1% 28.7µs ± 8% +31.09% (p=0.000 n=18+20) Exp3Power/0x10000-8 204µs ± 0% 267µs ± 9% +30.81% (p=0.000 n=20+20) Exp3Power/0x40000-8 1.86ms ± 0% 2.26ms ± 5% +21.68% (p=0.000 n=18+19) Exp3Power/0x100000-8 17.5ms ± 1% 20.7ms ± 7% +18.39% (p=0.000 n=19+20) Exp3Power/0x400000-8 156ms ± 0% 172ms ± 6% +10.54% (p=0.000 n=19+20) Fibo-8 26.9ms ± 1% 27.5ms ± 3% +2.32% (p=0.000 n=19+19) NatSqr/1-8 31.0ns ± 4% 39.5ns ±29% +27.25% (p=0.000 n=20+19) NatSqr/2-8 54.1ns ± 1% 69.0ns ±28% +27.52% (p=0.000 n=20+20) NatSqr/3-8 66.6ns ± 1% 83.0ns ±25% +24.59% (p=0.000 n=20+20) NatSqr/5-8 97.1ns ± 1% 119.9ns ±12% +23.50% (p=0.000 n=16+20) NatSqr/8-8 138ns ± 1% 171ns ± 9% +24.20% (p=0.000 n=19+20) NatSqr/10-8 182ns ± 0% 225ns ± 9% +23.50% (p=0.000 n=16+20) NatSqr/20-8 447ns ± 1% 624ns ± 6% +39.64% (p=0.000 n=19+19) NatSqr/30-8 736ns ± 2% 986ns ± 9% +33.94% (p=0.000 n=19+20) NatSqr/50-8 1.51µs ± 2% 1.97µs ± 9% +30.42% (p=0.000 n=20+20) NatSqr/80-8 3.03µs ± 1% 3.67µs ± 7% +21.08% (p=0.000 n=20+20) NatSqr/100-8 4.31µs ± 1% 5.20µs ± 7% +20.52% (p=0.000 n=19+20) NatSqr/200-8 14.2µs ± 0% 16.3µs ± 4% +14.92% (p=0.000 n=19+20) NatSqr/300-8 27.8µs ± 1% 33.2µs ± 7% +19.28% (p=0.000 n=20+18) NatSqr/500-8 66.6µs ± 1% 74.5µs ± 3% +11.87% (p=0.000 n=18+18) NatSqr/800-8 135µs ± 1% 165µs ± 7% +22.33% (p=0.000 n=20+20) NatSqr/1000-8 200µs ± 0% 228µs ± 3% +14.39% (p=0.000 n=19+20) NatSetBytes/8-8 8.87ns ± 4% 8.77ns ± 2% -1.17% (p=0.020 n=20+16) NatSetBytes/24-8 38.6ns ± 3% 49.5ns ±29% +28.32% (p=0.000 n=18+19) NatSetBytes/128-8 75.2ns ± 1% 120.7ns ±29% +60.60% (p=0.000 n=17+20) NatSetBytes/7-8 16.2ns ± 2% 16.5ns ± 2% 
+1.76% (p=0.000 n=20+20) NatSetBytes/23-8 46.5ns ± 1% 60.2ns ±24% +29.59% (p=0.000 n=20+20) NatSetBytes/127-8 83.1ns ± 1% 118.2ns ±20% +42.33% (p=0.000 n=18+20) ScanPi-8 89.1µs ± 1% 117.4µs ±12% +31.75% (p=0.000 n=18+20) StringPiParallel-8 35.1µs ± 9% 40.2µs ±12% +14.53% (p=0.000 n=20+20) Scan/10/Base2-8 410ns ±14% 429ns ±10% +4.47% (p=0.018 n=19+20) Scan/100/Base2-8 3.05µs ±20% 2.97µs ±14% ~ (p=0.449 n=20+20) Scan/1000/Base2-8 29.3µs ± 8% 30.1µs ±23% ~ (p=0.355 n=20+20) Scan/10000/Base2-8 402µs ±13% 395µs ±14% ~ (p=0.355 n=20+20) Scan/100000/Base2-8 11.8ms ±10% 11.6ms ± 1% ~ (p=0.245 n=17+18) Scan/10/Base8-8 194ns ± 6% 196ns ±12% ~ (p=0.829 n=20+19) Scan/100/Base8-8 1.11µs ±15% 1.11µs ±12% ~ (p=0.743 n=20+20) Scan/1000/Base8-8 11.7µs ±10% 11.7µs ±12% ~ (p=0.904 n=20+20) Scan/10000/Base8-8 209µs ± 7% 210µs ± 8% ~ (p=0.478 n=20+20) Scan/100000/Base8-8 10.6ms ± 7% 10.4ms ± 6% ~ (p=0.112 n=20+18) Scan/10/Base10-8 182ns ±12% 188ns ±11% +3.52% (p=0.044 n=20+20) Scan/100/Base10-8 1.01µs ± 8% 1.00µs ±13% ~ (p=0.588 n=20+20) Scan/1000/Base10-8 10.7µs ±20% 10.6µs ±14% ~ (p=0.560 n=20+20) Scan/10000/Base10-8 195µs ±10% 194µs ± 9% ~ (p=0.883 n=20+20) Scan/100000/Base10-8 10.6ms ± 2% 10.6ms ± 2% ~ (p=0.495 n=20+20) Scan/10/Base16-8 166ns ±10% 174ns ±17% ~ (p=0.072 n=20+20) Scan/100/Base16-8 836ns ±10% 826ns ±12% ~ (p=0.562 n=20+17) Scan/1000/Base16-8 8.96µs ±13% 8.65µs ± 9% ~ (p=0.203 n=20+18) Scan/10000/Base16-8 198µs ± 3% 198µs ± 5% ~ (p=0.718 n=20+20) Scan/100000/Base16-8 11.1ms ± 3% 11.0ms ± 4% ~ (p=0.512 n=20+20) String/10/Base2-8 88.1ns ± 7% 94.1ns ±11% +6.80% (p=0.000 n=19+20) String/100/Base2-8 577ns ± 4% 598ns ± 5% +3.72% (p=0.000 n=20+20) String/1000/Base2-8 5.25µs ± 2% 5.62µs ± 5% +7.04% (p=0.000 n=19+20) String/10000/Base2-8 55.6µs ± 1% 60.1µs ± 2% +8.12% (p=0.000 n=19+19) String/100000/Base2-8 519µs ± 2% 560µs ± 2% +7.91% (p=0.000 n=18+17) String/10/Base8-8 52.2ns ± 8% 53.3ns ±12% ~ (p=0.188 n=20+18) String/100/Base8-8 218ns ± 3% 232ns ±10% +6.66% (p=0.000 
n=20+20) String/1000/Base8-8 1.84µs ± 3% 1.94µs ± 4% +5.07% (p=0.000 n=20+18) String/10000/Base8-8 18.1µs ± 2% 19.1µs ± 3% +5.84% (p=0.000 n=20+19) String/100000/Base8-8 184µs ± 2% 197µs ± 1% +7.15% (p=0.000 n=19+19) String/10/Base10-8 158ns ± 7% 146ns ± 6% -7.65% (p=0.000 n=20+19) String/100/Base10-8 807ns ± 2% 845ns ± 4% +4.79% (p=0.000 n=20+19) String/1000/Base10-8 3.99µs ± 3% 3.99µs ± 7% ~ (p=0.920 n=20+20) String/10000/Base10-8 20.8µs ± 6% 22.1µs ±10% +6.11% (p=0.000 n=19+20) String/100000/Base10-8 5.60ms ± 2% 5.59ms ± 2% ~ (p=0.749 n=20+19) String/10/Base16-8 49.0ns ±13% 49.3ns ±16% ~ (p=0.581 n=19+20) String/100/Base16-8 173ns ± 5% 185ns ± 6% +6.63% (p=0.000 n=20+18) String/1000/Base16-8 1.38µs ± 3% 1.49µs ±10% +8.27% (p=0.000 n=19+20) String/10000/Base16-8 13.5µs ± 2% 14.5µs ± 3% +7.08% (p=0.000 n=20+20) String/100000/Base16-8 138µs ± 4% 148µs ± 4% +7.57% (p=0.000 n=19+20) LeafSize/0-8 2.74ms ± 1% 2.79ms ± 2% +2.00% (p=0.000 n=19+19) LeafSize/1-8 24.8µs ± 4% 26.1µs ± 8% +5.33% (p=0.000 n=18+19) LeafSize/2-8 24.9µs ± 7% 25.0µs ± 8% ~ (p=0.989 n=20+19) LeafSize/3-8 97.6µs ± 3% 100.2µs ± 5% +2.66% (p=0.001 n=20+19) LeafSize/4-8 25.2µs ± 5% 25.4µs ± 5% ~ (p=0.173 n=19+20) LeafSize/5-8 118µs ± 2% 119µs ± 5% ~ (p=0.478 n=20+20) LeafSize/6-8 97.6µs ± 3% 100.1µs ± 8% +2.65% (p=0.021 n=20+19) LeafSize/7-8 65.6µs ± 5% 67.5µs ± 6% +2.92% (p=0.003 n=20+19) LeafSize/8-8 25.5µs ± 5% 25.6µs ± 6% ~ (p=0.461 n=19+20) LeafSize/9-8 134µs ± 4% 136µs ± 5% ~ (p=0.194 n=19+20) LeafSize/10-8 119µs ± 3% 122µs ± 3% +2.52% (p=0.000 n=20+19) LeafSize/11-8 115µs ± 5% 116µs ± 5% ~ (p=0.158 n=20+19) LeafSize/12-8 97.4µs ± 4% 100.3µs ± 5% +2.91% (p=0.003 n=19+20) LeafSize/13-8 93.1µs ± 4% 93.0µs ± 6% ~ (p=0.698 n=20+20) LeafSize/14-8 67.0µs ± 3% 69.7µs ± 6% +4.10% (p=0.000 n=20+20) LeafSize/15-8 48.3µs ± 2% 49.3µs ± 6% +1.91% (p=0.014 n=19+20) LeafSize/16-8 25.6µs ± 5% 25.6µs ± 6% ~ (p=0.947 n=20+20) LeafSize/32-8 30.1µs ± 4% 30.3µs ± 5% ~ (p=0.685 n=18+19) LeafSize/64-8 53.4µs ± 2% 
54.0µs ± 3% ~ (p=0.053 n=19+19) ProbablyPrime/n=0-8 3.59ms ± 1% 3.55ms ± 1% -1.12% (p=0.000 n=20+18) ProbablyPrime/n=1-8 4.21ms ± 2% 4.17ms ± 2% -0.73% (p=0.018 n=20+19) ProbablyPrime/n=5-8 6.74ms ± 1% 6.72ms ± 1% ~ (p=0.102 n=20+20) ProbablyPrime/n=10-8 9.91ms ± 1% 9.89ms ± 2% ~ (p=0.322 n=19+20) ProbablyPrime/n=20-8 16.2ms ± 1% 16.1ms ± 2% -0.52% (p=0.006 n=19+19) ProbablyPrime/Lucas-8 2.94ms ± 1% 2.95ms ± 1% +0.52% (p=0.002 n=18+19) ProbablyPrime/MillerRabinBase2-8 641µs ± 2% 640µs ± 2% ~ (p=0.607 n=19+20) FloatSqrt/64-8 653ns ± 5% 704ns ± 5% +7.82% (p=0.000 n=19+20) FloatSqrt/128-8 1.32µs ± 3% 1.42µs ± 5% +7.29% (p=0.000 n=18+20) FloatSqrt/256-8 1.44µs ± 2% 1.45µs ± 4% ~ (p=0.089 n=19+19) FloatSqrt/1000-8 3.36µs ± 3% 3.42µs ± 5% +1.82% (p=0.012 n=20+20) FloatSqrt/10000-8 25.5µs ± 2% 27.5µs ± 7% +7.91% (p=0.000 n=18+19) FloatSqrt/100000-8 629µs ± 6% 663µs ± 9% +5.32% (p=0.000 n=18+20) FloatSqrt/1000000-8 46.4ms ± 2% 46.6ms ± 5% ~ (p=0.351 n=20+19) [Geo mean] 9.60µs 10.01µs +4.28% name old alloc/op new alloc/op delta DecimalConversion-8 54.0kB ± 0% 43.6kB ± 0% -19.40% (p=0.000 n=20+20) FloatString/100-8 400B ± 0% 400B ± 0% ~ (all equal) FloatString/1000-8 3.10kB ± 0% 3.10kB ± 0% ~ (all equal) FloatString/10000-8 52.1kB ± 0% 52.1kB ± 0% ~ (p=0.153 n=20+20) FloatString/100000-8 582kB ± 0% 582kB ± 0% ~ (all equal) FloatAdd/10-8 0.00B 0.00B ~ (all equal) FloatAdd/100-8 0.00B 0.00B ~ (all equal) FloatAdd/1000-8 0.00B 0.00B ~ (all equal) FloatAdd/10000-8 0.00B 0.00B ~ (all equal) FloatAdd/100000-8 0.00B 0.00B ~ (all equal) FloatSub/10-8 0.00B 0.00B ~ (all equal) FloatSub/100-8 0.00B 0.00B ~ (all equal) FloatSub/1000-8 0.00B 0.00B ~ (all equal) FloatSub/10000-8 0.00B 0.00B ~ (all equal) FloatSub/100000-8 0.00B 0.00B ~ (all equal) ParseFloatSmallExp-8 4.18kB ± 0% 3.60kB ± 0% -13.79% (p=0.000 n=20+20) ParseFloatLargeExp-8 18.9kB ± 0% 19.3kB ± 0% +2.25% (p=0.000 n=20+20) GCD10x10/WithoutXY-8 96.0B ± 0% 16.0B ± 0% -83.33% (p=0.000 n=20+20) GCD10x10/WithXY-8 240B ± 0% 88B ± 
0% -63.33% (p=0.000 n=20+20) GCD10x100/WithoutXY-8 192B ± 0% 112B ± 0% -41.67% (p=0.000 n=20+20) GCD10x100/WithXY-8 464B ± 0% 424B ± 0% -8.62% (p=0.000 n=20+20) GCD10x1000/WithoutXY-8 416B ± 0% 336B ± 0% -19.23% (p=0.000 n=20+20) GCD10x1000/WithXY-8 1.25kB ± 0% 1.10kB ± 0% -12.18% (p=0.000 n=20+20) GCD10x10000/WithoutXY-8 2.91kB ± 0% 2.83kB ± 0% -2.75% (p=0.000 n=20+20) GCD10x10000/WithXY-8 8.70kB ± 0% 8.55kB ± 0% -1.76% (p=0.000 n=16+16) GCD10x100000/WithoutXY-8 27.2kB ± 0% 27.2kB ± 0% -0.29% (p=0.000 n=20+20) GCD10x100000/WithXY-8 82.4kB ± 0% 82.3kB ± 0% -0.17% (p=0.000 n=20+19) GCD100x100/WithoutXY-8 288B ± 0% 384B ± 0% +33.33% (p=0.000 n=20+20) GCD100x100/WithXY-8 464B ± 0% 576B ± 0% +24.14% (p=0.000 n=20+20) GCD100x1000/WithoutXY-8 640B ± 0% 688B ± 0% +7.50% (p=0.000 n=20+20) GCD100x1000/WithXY-8 1.52kB ± 0% 1.46kB ± 0% -3.68% (p=0.000 n=20+20) GCD100x10000/WithoutXY-8 4.24kB ± 0% 4.29kB ± 0% +1.13% (p=0.000 n=20+20) GCD100x10000/WithXY-8 11.1kB ± 0% 11.0kB ± 0% -0.51% (p=0.000 n=15+20) GCD100x100000/WithoutXY-8 40.9kB ± 0% 40.9kB ± 0% +0.12% (p=0.000 n=20+19) GCD100x100000/WithXY-8 110kB ± 0% 109kB ± 0% -0.08% (p=0.000 n=20+20) GCD1000x1000/WithoutXY-8 1.22kB ± 0% 1.06kB ± 0% -13.16% (p=0.000 n=20+20) GCD1000x1000/WithXY-8 2.37kB ± 0% 2.11kB ± 0% -10.83% (p=0.000 n=20+20) GCD1000x10000/WithoutXY-8 4.71kB ± 0% 4.63kB ± 0% -1.70% (p=0.000 n=20+19) GCD1000x10000/WithXY-8 28.2kB ± 0% 28.0kB ± 0% -0.43% (p=0.000 n=20+15) GCD1000x100000/WithoutXY-8 41.3kB ± 0% 41.2kB ± 0% -0.20% (p=0.000 n=20+16) GCD1000x100000/WithXY-8 301kB ± 0% 301kB ± 0% -0.13% (p=0.000 n=20+20) GCD10000x10000/WithoutXY-8 8.64kB ± 0% 8.48kB ± 0% -1.85% (p=0.000 n=20+20) GCD10000x10000/WithXY-8 57.2kB ± 0% 57.7kB ± 0% +0.80% (p=0.000 n=20+20) GCD10000x100000/WithoutXY-8 43.8kB ± 0% 43.7kB ± 0% -0.19% (p=0.000 n=20+18) GCD10000x100000/WithXY-8 2.08MB ± 0% 2.08MB ± 0% -0.02% (p=0.000 n=15+19) GCD100000x100000/WithoutXY-8 81.6kB ± 0% 81.4kB ± 0% -0.20% (p=0.000 n=20+20) GCD100000x100000/WithXY-8 
4.32MB ± 0% 4.33MB ± 0% +0.12% (p=0.000 n=20+20) Hilbert-8 653kB ± 0% 313kB ± 0% -52.13% (p=0.000 n=19+20) Binomial-8 1.82kB ± 0% 1.02kB ± 0% -43.86% (p=0.000 n=20+20) QuoRem-8 0.00B 0.00B ~ (all equal) Exp-8 11.1kB ± 0% 11.0kB ± 0% -0.34% (p=0.000 n=19+20) Exp2-8 11.3kB ± 0% 11.3kB ± 0% -0.35% (p=0.000 n=19+20) Bitset-8 0.00B 0.00B ~ (all equal) BitsetNeg-8 0.00B 0.00B ~ (all equal) BitsetOrig-8 103B ± 0% 63B ± 0% -38.83% (p=0.000 n=20+20) BitsetNegOrig-8 215B ± 0% 175B ± 0% -18.60% (p=0.000 n=20+20) ModSqrt225_Tonelli-8 11.3kB ± 0% 11.0kB ± 0% -2.76% (p=0.000 n=20+17) ModSqrt225_3Mod4-8 3.57kB ± 0% 3.53kB ± 0% -1.12% (p=0.000 n=20+20) ModSqrt231_Tonelli-8 11.0kB ± 0% 10.7kB ± 0% -2.55% (p=0.000 n=20+20) ModSqrt231_5Mod8-8 4.21kB ± 0% 4.09kB ± 0% -2.85% (p=0.000 n=16+20) ModInverse-8 1.44kB ± 0% 1.28kB ± 0% -11.11% (p=0.000 n=20+20) Sqrt-8 6.00kB ± 0% 6.00kB ± 0% ~ (all equal) IntSqr/1-8 0.00B 0.00B ~ (all equal) IntSqr/2-8 0.00B 0.00B ~ (all equal) IntSqr/3-8 0.00B 0.00B ~ (all equal) IntSqr/5-8 0.00B 0.00B ~ (all equal) IntSqr/8-8 0.00B 0.00B ~ (all equal) IntSqr/10-8 0.00B 0.00B ~ (all equal) IntSqr/20-8 320B ± 0% 320B ± 0% ~ (all equal) IntSqr/30-8 480B ± 0% 480B ± 0% ~ (all equal) IntSqr/50-8 896B ± 0% 896B ± 0% ~ (all equal) IntSqr/80-8 1.28kB ± 0% 1.28kB ± 0% ~ (all equal) IntSqr/100-8 1.79kB ± 0% 1.79kB ± 0% ~ (all equal) IntSqr/200-8 3.20kB ± 0% 3.20kB ± 0% ~ (all equal) IntSqr/300-8 8.06kB ± 0% 8.06kB ± 0% ~ (all equal) IntSqr/500-8 12.3kB ± 0% 12.3kB ± 0% ~ (all equal) IntSqr/800-8 28.8kB ± 0% 28.8kB ± 0% ~ (all equal) IntSqr/1000-8 36.9kB ± 0% 36.9kB ± 0% ~ (all equal) Div/20/10-8 0.00B 0.00B ~ (all equal) Div/200/100-8 0.00B 0.00B ~ (all equal) Div/2000/1000-8 0.00B 0.00B ~ (all equal) Div/20000/10000-8 0.00B 0.00B ~ (all equal) Div/200000/100000-8 690B ± 0% 690B ± 0% ~ (all equal) Mul-8 565kB ± 0% 565kB ± 0% ~ (all equal) ZeroShifts/Shl-8 6.53kB ± 0% 6.53kB ± 0% ~ (all equal) ZeroShifts/ShlSame-8 0.00B 0.00B ~ (all equal) ZeroShifts/Shr-8 6.53kB ± 0% 
6.53kB ± 0% ~ (all equal) ZeroShifts/ShrSame-8 0.00B 0.00B ~ (all equal) Exp3Power/0x10-8 192B ± 0% 112B ± 0% -41.67% (p=0.000 n=20+20) Exp3Power/0x40-8 192B ± 0% 112B ± 0% -41.67% (p=0.000 n=20+20) Exp3Power/0x100-8 288B ± 0% 208B ± 0% -27.78% (p=0.000 n=20+20) Exp3Power/0x400-8 672B ± 0% 592B ± 0% -11.90% (p=0.000 n=20+20) Exp3Power/0x1000-8 3.33kB ± 0% 3.25kB ± 0% -2.40% (p=0.000 n=20+20) Exp3Power/0x4000-8 13.8kB ± 0% 13.7kB ± 0% -0.58% (p=0.000 n=20+20) Exp3Power/0x10000-8 117kB ± 0% 117kB ± 0% -0.07% (p=0.000 n=20+20) Exp3Power/0x40000-8 755kB ± 0% 755kB ± 0% -0.01% (p=0.000 n=19+20) Exp3Power/0x100000-8 5.22MB ± 0% 5.22MB ± 0% -0.00% (p=0.000 n=20+20) Exp3Power/0x400000-8 39.8MB ± 0% 39.8MB ± 0% -0.00% (p=0.000 n=20+19) Fibo-8 3.09MB ± 0% 3.08MB ± 0% -0.28% (p=0.000 n=20+16) NatSqr/1-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) NatSqr/2-8 64.0B ± 0% 64.0B ± 0% ~ (all equal) NatSqr/3-8 80.0B ± 0% 80.0B ± 0% ~ (all equal) NatSqr/5-8 112B ± 0% 112B ± 0% ~ (all equal) NatSqr/8-8 160B ± 0% 160B ± 0% ~ (all equal) NatSqr/10-8 192B ± 0% 192B ± 0% ~ (all equal) NatSqr/20-8 672B ± 0% 672B ± 0% ~ (all equal) NatSqr/30-8 992B ± 0% 992B ± 0% ~ (all equal) NatSqr/50-8 1.79kB ± 0% 1.79kB ± 0% ~ (all equal) NatSqr/80-8 2.69kB ± 0% 2.69kB ± 0% ~ (all equal) NatSqr/100-8 3.58kB ± 0% 3.58kB ± 0% ~ (all equal) NatSqr/200-8 6.66kB ± 0% 6.66kB ± 0% ~ (all equal) NatSqr/300-8 24.4kB ± 0% 24.4kB ± 0% ~ (all equal) NatSqr/500-8 36.9kB ± 0% 36.9kB ± 0% ~ (all equal) NatSqr/800-8 69.8kB ± 0% 69.8kB ± 0% ~ (all equal) NatSqr/1000-8 86.0kB ± 0% 86.0kB ± 0% ~ (all equal) NatSetBytes/8-8 0.00B 0.00B ~ (all equal) NatSetBytes/24-8 64.0B ± 0% 64.0B ± 0% ~ (all equal) NatSetBytes/128-8 160B ± 0% 160B ± 0% ~ (all equal) NatSetBytes/7-8 0.00B 0.00B ~ (all equal) NatSetBytes/23-8 64.0B ± 0% 64.0B ± 0% ~ (all equal) NatSetBytes/127-8 160B ± 0% 160B ± 0% ~ (all equal) ScanPi-8 75.4kB ± 0% 75.7kB ± 0% +0.41% (p=0.000 n=20+20) StringPiParallel-8 20.4kB ± 0% 20.4kB ± 0% ~ (p=0.223 n=20+20) Scan/10/Base2-8 
48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/100/Base2-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/1000/Base2-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/10000/Base2-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/100000/Base2-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/10/Base8-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/100/Base8-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/1000/Base8-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/10000/Base8-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/100000/Base8-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/10/Base10-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/100/Base10-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/1000/Base10-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/10000/Base10-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/100000/Base10-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/10/Base16-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/100/Base16-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/1000/Base16-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/10000/Base16-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) Scan/100000/Base16-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) String/10/Base2-8 48.0B ± 0% 48.0B ± 0% ~ (all equal) String/100/Base2-8 352B ± 0% 352B ± 0% ~ (all equal) String/1000/Base2-8 3.46kB ± 0% 3.46kB ± 0% ~ (all equal) String/10000/Base2-8 41.0kB ± 0% 41.0kB ± 0% ~ (all equal) String/100000/Base2-8 336kB ± 0% 336kB ± 0% ~ (all equal) String/10/Base8-8 16.0B ± 0% 16.0B ± 0% ~ (all equal) String/100/Base8-8 112B ± 0% 112B ± 0% ~ (all equal) String/1000/Base8-8 1.15kB ± 0% 1.15kB ± 0% ~ (all equal) String/10000/Base8-8 12.3kB ± 0% 12.3kB ± 0% ~ (all equal) String/100000/Base8-8 115kB ± 0% 115kB ± 0% ~ (all equal) String/10/Base10-8 64.0B ± 0% 24.0B ± 0% -62.50% (p=0.000 n=20+20) String/100/Base10-8 192B ± 0% 192B ± 0% ~ (all equal) String/1000/Base10-8 1.95kB ± 0% 1.95kB ± 0% ~ (all equal) String/10000/Base10-8 20.0kB ± 0% 20.0kB ± 0% ~ (p=0.983 n=19+20) String/100000/Base10-8 210kB ± 1% 211kB ± 1% +0.82% (p=0.000 n=19+20) String/10/Base16-8 16.0B ± 0% 16.0B ± 0% ~ 
(all equal) String/100/Base16-8 96.0B ± 0% 96.0B ± 0% ~ (all equal) String/1000/Base16-8 896B ± 0% 896B ± 0% ~ (all equal) String/10000/Base16-8 9.47kB ± 0% 9.47kB ± 0% ~ (all equal) String/100000/Base16-8 90.1kB ± 0% 90.1kB ± 0% ~ (all equal) LeafSize/0-8 16.9kB ± 0% 16.8kB ± 0% -0.44% (p=0.000 n=20+20) LeafSize/1-8 22.4kB ± 0% 22.3kB ± 0% -0.34% (p=0.000 n=20+19) LeafSize/2-8 22.4kB ± 0% 22.3kB ± 0% -0.34% (p=0.000 n=20+19) LeafSize/3-8 22.4kB ± 0% 22.3kB ± 0% -0.34% (p=0.000 n=20+17) LeafSize/4-8 22.4kB ± 0% 22.3kB ± 0% -0.34% (p=0.000 n=20+19) LeafSize/5-8 22.4kB ± 0% 22.3kB ± 0% -0.33% (p=0.000 n=20+20) LeafSize/6-8 22.3kB ± 0% 22.2kB ± 0% -0.34% (p=0.000 n=20+20) LeafSize/7-8 22.3kB ± 0% 22.2kB ± 0% -0.35% (p=0.000 n=20+20) LeafSize/8-8 22.3kB ± 0% 22.2kB ± 0% -0.34% (p=0.000 n=16+20) LeafSize/9-8 22.3kB ± 0% 22.2kB ± 0% -0.33% (p=0.000 n=20+20) LeafSize/10-8 22.3kB ± 0% 22.2kB ± 0% -0.33% (p=0.000 n=20+20) LeafSize/11-8 22.3kB ± 0% 22.2kB ± 0% -0.33% (p=0.000 n=20+20) LeafSize/12-8 22.3kB ± 0% 22.2kB ± 0% -0.33% (p=0.000 n=20+20) LeafSize/13-8 22.3kB ± 0% 22.2kB ± 0% -0.34% (p=0.000 n=20+15) LeafSize/14-8 22.3kB ± 0% 22.2kB ± 0% -0.33% (p=0.000 n=20+20) LeafSize/15-8 22.3kB ± 0% 22.2kB ± 0% -0.33% (p=0.000 n=20+20) LeafSize/16-8 22.3kB ± 0% 22.2kB ± 0% -0.33% (p=0.000 n=19+20) LeafSize/32-8 22.3kB ± 0% 22.2kB ± 0% -0.32% (p=0.000 n=20+20) LeafSize/64-8 21.8kB ± 0% 21.7kB ± 0% -0.33% (p=0.000 n=18+19) ProbablyPrime/n=0-8 15.3kB ± 0% 14.9kB ± 0% -2.35% (p=0.000 n=20+20) ProbablyPrime/n=1-8 21.0kB ± 0% 20.7kB ± 0% -1.71% (p=0.000 n=20+20) ProbablyPrime/n=5-8 43.4kB ± 0% 42.9kB ± 0% -1.20% (p=0.000 n=20+20) ProbablyPrime/n=10-8 71.5kB ± 0% 70.7kB ± 0% -1.01% (p=0.000 n=19+20) ProbablyPrime/n=20-8 127kB ± 0% 126kB ± 0% -0.88% (p=0.000 n=20+20) ProbablyPrime/Lucas-8 3.07kB ± 0% 2.79kB ± 0% -9.12% (p=0.000 n=20+20) ProbablyPrime/MillerRabinBase2-8 12.1kB ± 0% 12.0kB ± 0% -0.66% (p=0.000 n=20+20) FloatSqrt/64-8 416B ± 0% 360B ± 0% -13.46% (p=0.000 n=20+20) 
FloatSqrt/128-8 640B ± 0% 584B ± 0% -8.75% (p=0.000 n=20+20) FloatSqrt/256-8 512B ± 0% 472B ± 0% -7.81% (p=0.000 n=20+20) FloatSqrt/1000-8 1.47kB ± 0% 1.43kB ± 0% -2.72% (p=0.000 n=20+20) FloatSqrt/10000-8 18.2kB ± 0% 18.1kB ± 0% -0.22% (p=0.000 n=20+20) FloatSqrt/100000-8 204kB ± 0% 204kB ± 0% -0.02% (p=0.000 n=20+20) FloatSqrt/1000000-8 6.37MB ± 0% 6.37MB ± 0% -0.00% (p=0.000 n=19+20) [Geo mean] 3.42kB 3.24kB -5.33% name old allocs/op new allocs/op delta DecimalConversion-8 1.65k ± 0% 1.65k ± 0% ~ (all equal) FloatString/100-8 8.00 ± 0% 8.00 ± 0% ~ (all equal) FloatString/1000-8 9.00 ± 0% 9.00 ± 0% ~ (all equal) FloatString/10000-8 22.0 ± 0% 22.0 ± 0% ~ (all equal) FloatString/100000-8 136 ± 0% 136 ± 0% ~ (all equal) FloatAdd/10-8 0.00 0.00 ~ (all equal) FloatAdd/100-8 0.00 0.00 ~ (all equal) FloatAdd/1000-8 0.00 0.00 ~ (all equal) FloatAdd/10000-8 0.00 0.00 ~ (all equal) FloatAdd/100000-8 0.00 0.00 ~ (all equal) FloatSub/10-8 0.00 0.00 ~ (all equal) FloatSub/100-8 0.00 0.00 ~ (all equal) FloatSub/1000-8 0.00 0.00 ~ (all equal) FloatSub/10000-8 0.00 0.00 ~ (all equal) FloatSub/100000-8 0.00 0.00 ~ (all equal) ParseFloatSmallExp-8 110 ± 0% 130 ± 0% +18.18% (p=0.000 n=20+20) ParseFloatLargeExp-8 319 ± 0% 371 ± 0% +16.30% (p=0.000 n=20+20) GCD10x10/WithoutXY-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) GCD10x10/WithXY-8 5.00 ± 0% 6.00 ± 0% +20.00% (p=0.000 n=20+20) GCD10x100/WithoutXY-8 4.00 ± 0% 4.00 ± 0% ~ (all equal) GCD10x100/WithXY-8 9.00 ± 0% 12.00 ± 0% +33.33% (p=0.000 n=20+20) GCD10x1000/WithoutXY-8 4.00 ± 0% 4.00 ± 0% ~ (all equal) GCD10x1000/WithXY-8 11.0 ± 0% 12.0 ± 0% +9.09% (p=0.000 n=20+20) GCD10x10000/WithoutXY-8 4.00 ± 0% 4.00 ± 0% ~ (all equal) GCD10x10000/WithXY-8 11.0 ± 0% 12.0 ± 0% +9.09% (p=0.000 n=20+20) GCD10x100000/WithoutXY-8 4.00 ± 0% 4.00 ± 0% ~ (all equal) GCD10x100000/WithXY-8 11.0 ± 0% 12.0 ± 0% +9.09% (p=0.000 n=20+20) GCD100x100/WithoutXY-8 6.00 ± 0% 10.00 ± 0% +66.67% (p=0.000 n=20+20) GCD100x100/WithXY-8 9.00 ± 0% 15.00 ± 0% +66.67% (p=0.000 
n=20+20) GCD100x1000/WithoutXY-8 6.00 ± 0% 8.00 ± 0% +33.33% (p=0.000 n=20+20) GCD100x1000/WithXY-8 12.0 ± 0% 13.0 ± 0% +8.33% (p=0.000 n=20+20) GCD100x10000/WithoutXY-8 6.00 ± 0% 8.00 ± 0% +33.33% (p=0.000 n=20+20) GCD100x10000/WithXY-8 12.0 ± 0% 13.0 ± 0% +8.33% (p=0.000 n=20+20) GCD100x100000/WithoutXY-8 6.00 ± 0% 8.00 ± 0% +33.33% (p=0.000 n=20+20) GCD100x100000/WithXY-8 12.0 ± 0% 13.0 ± 0% +8.33% (p=0.000 n=20+20) GCD1000x1000/WithoutXY-8 10.0 ± 0% 10.0 ± 0% ~ (all equal) GCD1000x1000/WithXY-8 19.0 ± 0% 20.0 ± 0% +5.26% (p=0.000 n=20+20) GCD1000x10000/WithoutXY-8 8.00 ± 0% 8.00 ± 0% ~ (all equal) GCD1000x10000/WithXY-8 26.0 ± 0% 26.0 ± 0% ~ (all equal) GCD1000x100000/WithoutXY-8 8.00 ± 0% 8.00 ± 0% ~ (all equal) GCD1000x100000/WithXY-8 27.0 ± 0% 27.0 ± 0% ~ (all equal) GCD10000x10000/WithoutXY-8 10.0 ± 0% 10.0 ± 0% ~ (all equal) GCD10000x10000/WithXY-8 76.0 ± 0% 78.0 ± 0% +2.63% (p=0.000 n=20+20) GCD10000x100000/WithoutXY-8 8.00 ± 0% 8.00 ± 0% ~ (all equal) GCD10000x100000/WithXY-8 174 ± 0% 174 ± 0% ~ (all equal) GCD100000x100000/WithoutXY-8 10.0 ± 0% 10.0 ± 0% ~ (all equal) GCD100000x100000/WithXY-8 645 ± 0% 647 ± 0% +0.31% (p=0.000 n=20+20) Hilbert-8 14.1k ± 0% 14.3k ± 0% +0.92% (p=0.000 n=20+20) Binomial-8 38.0 ± 0% 38.0 ± 0% ~ (all equal) QuoRem-8 0.00 0.00 ~ (all equal) Exp-8 21.0 ± 0% 21.0 ± 0% ~ (all equal) Exp2-8 22.0 ± 0% 22.0 ± 0% ~ (all equal) Bitset-8 0.00 0.00 ~ (all equal) BitsetNeg-8 0.00 0.00 ~ (all equal) BitsetOrig-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) BitsetNegOrig-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) ModSqrt225_Tonelli-8 85.0 ± 0% 86.0 ± 0% +1.18% (p=0.000 n=20+20) ModSqrt225_3Mod4-8 25.0 ± 0% 25.0 ± 0% ~ (all equal) ModSqrt231_Tonelli-8 80.0 ± 0% 80.0 ± 0% ~ (all equal) ModSqrt231_5Mod8-8 32.0 ± 0% 32.0 ± 0% ~ (all equal) ModInverse-8 11.0 ± 0% 11.0 ± 0% ~ (all equal) Sqrt-8 13.0 ± 0% 13.0 ± 0% ~ (all equal) IntSqr/1-8 0.00 0.00 ~ (all equal) IntSqr/2-8 0.00 0.00 ~ (all equal) IntSqr/3-8 0.00 0.00 ~ (all equal) IntSqr/5-8 0.00 0.00 ~ (all 
equal) IntSqr/8-8 0.00 0.00 ~ (all equal) IntSqr/10-8 0.00 0.00 ~ (all equal) IntSqr/20-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) IntSqr/30-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) IntSqr/50-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) IntSqr/80-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) IntSqr/100-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) IntSqr/200-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) IntSqr/300-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) IntSqr/500-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) IntSqr/800-8 9.00 ± 0% 9.00 ± 0% ~ (all equal) IntSqr/1000-8 9.00 ± 0% 9.00 ± 0% ~ (all equal) Div/20/10-8 0.00 0.00 ~ (all equal) Div/200/100-8 0.00 0.00 ~ (all equal) Div/2000/1000-8 0.00 0.00 ~ (all equal) Div/20000/10000-8 0.00 0.00 ~ (all equal) Div/200000/100000-8 0.00 0.00 ~ (all equal) Mul-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) ZeroShifts/Shl-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) ZeroShifts/ShlSame-8 0.00 0.00 ~ (all equal) ZeroShifts/Shr-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) ZeroShifts/ShrSame-8 0.00 0.00 ~ (all equal) Exp3Power/0x10-8 4.00 ± 0% 4.00 ± 0% ~ (all equal) Exp3Power/0x40-8 4.00 ± 0% 4.00 ± 0% ~ (all equal) Exp3Power/0x100-8 5.00 ± 0% 5.00 ± 0% ~ (all equal) Exp3Power/0x400-8 7.00 ± 0% 7.00 ± 0% ~ (all equal) Exp3Power/0x1000-8 11.0 ± 0% 11.0 ± 0% ~ (all equal) Exp3Power/0x4000-8 15.0 ± 0% 15.0 ± 0% ~ (all equal) Exp3Power/0x10000-8 29.0 ± 0% 29.0 ± 0% ~ (all equal) Exp3Power/0x40000-8 140 ± 0% 140 ± 0% ~ (all equal) Exp3Power/0x100000-8 1.12k ± 0% 1.12k ± 0% ~ (all equal) Exp3Power/0x400000-8 9.88k ± 0% 9.88k ± 0% ~ (p=0.747 n=17+19) Fibo-8 739 ± 0% 743 ± 0% +0.54% (p=0.000 n=20+20) NatSqr/1-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) NatSqr/2-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) NatSqr/3-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) NatSqr/5-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) NatSqr/8-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) NatSqr/10-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) NatSqr/20-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) NatSqr/30-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) NatSqr/50-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) NatSqr/80-8 2.00 ± 0% 
2.00 ± 0% ~ (all equal) NatSqr/100-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) NatSqr/200-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) NatSqr/300-8 4.00 ± 0% 4.00 ± 0% ~ (all equal) NatSqr/500-8 4.00 ± 0% 4.00 ± 0% ~ (all equal) NatSqr/800-8 10.0 ± 0% 10.0 ± 0% ~ (all equal) NatSqr/1000-8 10.0 ± 0% 10.0 ± 0% ~ (all equal) NatSetBytes/8-8 0.00 0.00 ~ (all equal) NatSetBytes/24-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) NatSetBytes/128-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) NatSetBytes/7-8 0.00 0.00 ~ (all equal) NatSetBytes/23-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) NatSetBytes/127-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) ScanPi-8 60.0 ± 0% 61.0 ± 0% +1.67% (p=0.000 n=20+20) StringPiParallel-8 24.0 ± 0% 24.0 ± 0% ~ (all equal) Scan/10/Base2-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/100/Base2-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/1000/Base2-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/10000/Base2-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/100000/Base2-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/10/Base8-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/100/Base8-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/1000/Base8-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/10000/Base8-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/100000/Base8-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/10/Base10-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/100/Base10-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/1000/Base10-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/10000/Base10-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/100000/Base10-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/10/Base16-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/100/Base16-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/1000/Base16-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/10000/Base16-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Scan/100000/Base16-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/10/Base2-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/100/Base2-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/1000/Base2-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/10000/Base2-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/100000/Base2-8 1.00 ± 0% 
1.00 ± 0% ~ (all equal) String/10/Base8-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/100/Base8-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/1000/Base8-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/10000/Base8-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/100000/Base8-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/10/Base10-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) String/100/Base10-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) String/1000/Base10-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) String/10000/Base10-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) String/100000/Base10-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) String/10/Base16-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/100/Base16-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/1000/Base16-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/10000/Base16-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) String/100000/Base16-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) LeafSize/0-8 10.0 ± 0% 10.0 ± 0% ~ (all equal) LeafSize/1-8 13.0 ± 0% 13.0 ± 0% ~ (all equal) LeafSize/2-8 13.0 ± 0% 13.0 ± 0% ~ (all equal) LeafSize/3-8 13.0 ± 0% 13.0 ± 0% ~ (all equal) LeafSize/4-8 13.0 ± 0% 13.0 ± 0% ~ (all equal) LeafSize/5-8 13.0 ± 0% 13.0 ± 0% ~ (all equal) LeafSize/6-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/7-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/8-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/9-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/10-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/11-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/12-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/13-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/14-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/15-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/16-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/32-8 12.0 ± 0% 12.0 ± 0% ~ (all equal) LeafSize/64-8 11.0 ± 0% 11.0 ± 0% ~ (all equal) ProbablyPrime/n=0-8 52.0 ± 0% 52.0 ± 0% ~ (all equal) ProbablyPrime/n=1-8 73.0 ± 0% 73.0 ± 0% ~ (all equal) ProbablyPrime/n=5-8 157 ± 0% 157 ± 0% ~ (all equal) ProbablyPrime/n=10-8 262 ± 0% 262 ± 0% ~ (all equal) ProbablyPrime/n=20-8 472 ± 0% 472 ± 0% ~ 
(all equal) ProbablyPrime/Lucas-8 22.0 ± 0% 22.0 ± 0% ~ (all equal) ProbablyPrime/MillerRabinBase2-8 29.0 ± 0% 29.0 ± 0% ~ (all equal) FloatSqrt/64-8 9.00 ± 0% 10.00 ± 0% +11.11% (p=0.000 n=20+20) FloatSqrt/128-8 12.0 ± 0% 13.0 ± 0% +8.33% (p=0.000 n=20+20) FloatSqrt/256-8 8.00 ± 0% 8.00 ± 0% ~ (all equal) FloatSqrt/1000-8 9.00 ± 0% 9.00 ± 0% ~ (all equal) FloatSqrt/10000-8 14.0 ± 0% 14.0 ± 0% ~ (all equal) FloatSqrt/100000-8 33.0 ± 0% 33.0 ± 0% ~ (all equal) FloatSqrt/1000000-8 1.16k ± 0% 1.16k ± 0% ~ (all equal) [Geo mean] 6.62 6.76 +2.09% Change-Id: Id9df4157cac1e07721e35cff7fcdefe60703873a Reviewed-on: https://go-review.googlesource.com/c/150999 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Alan Donovan <adonovan@google.com> Reviewed-by: Robert Griesemer <gri@golang.org> |
|
|
|
979d9027ae |
math/bits: define Div to panic when y<=hi
Div panics when y<=hi because either the quotient overflows the size of the output or division by zero occurs when y==0. This provides a uniform behavior for all implementations. Fixes #28316 Change-Id: If23aeb10e0709ee1a60b7d614afc9103d674a980 Reviewed-on: https://go-review.googlesource.com/c/149517 Reviewed-by: Robert Griesemer <gri@golang.org> |
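The panic condition described above can be observed from calling code. The following is a minimal sketch, assuming a hypothetical wrapper `safeDiv64` (not part of math/bits) that turns the panic into an error for illustration:

```go
package main

import (
	"fmt"
	"math/bits"
)

// safeDiv64 wraps bits.Div64 and converts its panic into an error.
// The name and the error-returning shape are illustrative only.
func safeDiv64(hi, lo, y uint64) (quo, rem uint64, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("bits.Div64 panicked: %v", r)
		}
	}()
	quo, rem = bits.Div64(hi, lo, y)
	return quo, rem, nil
}

func main() {
	// hi < y: the quotient of (1<<64)/2 fits in 64 bits.
	q, r, err := safeDiv64(1, 0, 2)
	fmt.Println(q, r, err)

	// y == 0: division by zero.
	_, _, err = safeDiv64(0, 1, 0)
	fmt.Println(err)

	// 0 < y <= hi: the quotient would not fit in 64 bits.
	_, _, err = safeDiv64(2, 0, 2)
	fmt.Println(err)
}
```

Callers that cannot rule out y <= hi would normally check the condition up front rather than rely on recover.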
|
|
|
ead5d1e316 |
math/bits: panic when y<=hi in Div
Explicitly check for divide-by-zero/overflow and panic with the appropriate runtime error. The additional checks have basically no effect on performance since the branch is easily predicted.
name old time/op new time/op delta
Div-4 53.9ns ± 1% 53.0ns ± 1% -1.59% (p=0.016 n=4+5)
Div32-4 17.9ns ± 0% 18.4ns ± 0% +2.56% (p=0.008 n=5+5)
Div64-4 53.5ns ± 0% 53.3ns ± 0% ~ (p=0.095 n=5+5)
Updates #28316
Change-Id: I36297ee9946cbbc57fefb44d1730283b049ecf57
Reviewed-on: https://go-review.googlesource.com/c/144377
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org> |
|
|
|
3813edf26e |
all: use "reports whether" consistently in the few places that didn't
Go documentation style for boolean funcs is to say:
// Foo reports whether ...
func Foo() bool
(rather than "returns true if")
This CL also replaces 4 uses of "iff" with the same "reports whether"
wording, which doesn't lose any meaning, and will prevent people from
sending typo fixes when they don't realize it's "if and only if". In
the past I think we've had the typo CLs updated to just say "reports
whether". So do them all at once.
(Inspired by the addition of another "returns true if" in CL 146938
in fd_plan9.go)
Created with:
$ perl -i -npe 's/returns true iff/reports whether/' $(git grep -l "returns true iff" | grep -v vendor)
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor)
Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f
Reviewed-on: https://go-review.googlesource.com/c/147037
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
|
|
c86d464734 |
math/big: shallow copies of Int/Rat/Float are not supported (documentation)
Fixes #28423. Change-Id: Ie57ade565d0407a4bffaa86fb4475ff083168e79 Reviewed-on: https://go-review.googlesource.com/c/145537 Reviewed-by: Ian Lance Taylor <iant@golang.org> |
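To illustrate the alternative to a shallow copy, here is a small sketch that copies a `big.Int` with `Set`. The helper name `cloneAndIncrement` is hypothetical and exists only for this example:

```go
package main

import (
	"fmt"
	"math/big"
)

// cloneAndIncrement copies a big.Int with Set (a deep copy), increments
// the copy, and returns both values to show they are independent.
func cloneAndIncrement(v int64) (orig, copied int64) {
	a := big.NewInt(v)
	b := new(big.Int).Set(a) // deep copy; avoid `b := *a`
	b.Add(b, big.NewInt(1))  // modifies b only
	return a.Int64(), b.Int64()
}

func main() {
	fmt.Println(cloneAndIncrement(5))
}
```

A plain struct assignment such as `b := *a` would copy the slice header of the underlying storage, so the two values could share memory; `Set` avoids that.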
|
|
|
f28191340e |
math/big: fix a formula used as documentation
The function documentation was wrong, it was using a wrong parameter. This change replaces it with the right parameter.
The wrong formula was:
q = (u1<<_W + u0 - r)/y
The function has got a parameter "v" (of type Word), not a parameter "y". So, the right formula is:
q = (u1<<_W + u0 - r)/v
Fixes #28444
Change-Id: I82e57ba014735a9fdb6262874ddf498754d30d33
Reviewed-on: https://go-review.googlesource.com/c/145280
Reviewed-by: Robert Griesemer <gri@golang.org> |
|
|
|
899f3a2892 |
cmd/compile: intrinsify math/bits.Add on amd64
name old time/op new time/op delta
Add-8 1.11ns ± 0% 1.18ns ± 0% +6.31% (p=0.029 n=4+4)
Add32-8 1.02ns ± 0% 1.02ns ± 1% ~ (p=0.333 n=4+5)
Add64-8 1.11ns ± 1% 1.17ns ± 0% +5.79% (p=0.008 n=5+5)
Add64multiple-8 4.35ns ± 1% 0.86ns ± 0% -80.22% (p=0.000 n=5+4)
The individual ops are a bit slower (but still very fast). Using the ops in carry chains is very fast.
Update #28273
Change-Id: Id975f76df2b930abf0e412911d327b6c5b1befe5
Reviewed-on: https://go-review.googlesource.com/c/144257
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com> |
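A carry chain of the kind this commit speeds up might look like the sketch below, which adds two 128-bit values held as pairs of uint64. The helper `add128` is a hypothetical name chosen for illustration:

```go
package main

import (
	"fmt"
	"math/bits"
)

// add128 adds two 128-bit values given as (hi, lo) pairs.
// It returns the 128-bit sum and the final carry-out.
func add128(aHi, aLo, bHi, bLo uint64) (hi, lo, carry uint64) {
	var c uint64
	lo, c = bits.Add64(aLo, bLo, 0)     // low words, no incoming carry
	hi, carry = bits.Add64(aHi, bHi, c) // high words plus carry from low
	return hi, lo, carry
}

func main() {
	// (2^64 - 1) + 1 carries into the high word.
	fmt.Println(add128(0, ^uint64(0), 0, 1))
}
```

Each `bits.Add64` call feeds its carry into the next, which is the pattern the benchmark name `Add64multiple` suggests.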
|
|
|
127c51e48c |
math/bits: correct BenchmarkSub64
Previously, the benchmark was measuring Add64 instead of Sub64. Change-Id: I0cf30935c8a4728bead9868834377aae0b34f008 Reviewed-on: https://go-review.googlesource.com/c/144380 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
f90e89e675 |
all: fix a bunch of misspellings
Change-Id: If2954bdfc551515403706b2cd0dde94e45936e08
GitHub-Last-Rev:
|
|
|
|
47e71f3b69 |
math: use Abs in Pow rather than if x < 0 { x = -x }
name old time/op new time/op delta
PowInt 55.7ns ± 1% 53.4ns ± 2% -4.15% (p=0.000 n=9+9)
PowFrac 133ns ± 1% 133ns ± 2% ~ (p=0.587 n=8+9)
Change-Id: Ica0f4c2cbd554f2195c6d1762ed26742ff8e3924
Reviewed-on: https://go-review.googlesource.com/c/85375
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
497d24178f |
math: use Abs in Mod rather than if x < 0 { x = -x}
goos: linux
goarch: amd64
pkg: math
name old time/op new time/op delta
Mod 64.7ns ± 2% 63.7ns ± 2% -1.52% (p=0.003 n=8+10)
Change-Id: I851bec0fd6c223dab73e4a680b7393d49e81a0e8
Reviewed-on: https://go-review.googlesource.com/c/85095
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
b8ac64a581 |
all: this big patch removes whitespace from assembly files
Don't worry, this patch just removes trailing whitespace from assembly files and makes no logic changes. Change-Id: Ia724ac0b1abf8bc1e41454bdc79289ef317c165d Reviewed-on: https://go-review.googlesource.com/c/113595 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
a19a83c8ef |
cmd/compile: optimize math.Float64(32)bits and math.Float64(32)frombits on arm64
Use float <-> int register moves without conversion instead of stores and loads to move float <-> int values. Math package benchmark results. name old time/op new time/op delta Acosh 153ns ± 0% 147ns ± 0% -3.92% (p=0.000 n=10+10) Asinh 183ns ± 0% 177ns ± 0% -3.28% (p=0.000 n=10+10) Atanh 157ns ± 0% 155ns ± 0% -1.27% (p=0.000 n=10+10) Atan2 118ns ± 0% 117ns ± 1% -0.59% (p=0.003 n=10+10) Cbrt 119ns ± 0% 114ns ± 0% -4.20% (p=0.000 n=10+10) Copysign 7.51ns ± 0% 6.51ns ± 0% -13.32% (p=0.000 n=9+10) Cos 73.1ns ± 0% 70.6ns ± 0% -3.42% (p=0.000 n=10+10) Cosh 119ns ± 0% 121ns ± 0% +1.68% (p=0.000 n=10+9) ExpGo 154ns ± 0% 149ns ± 0% -3.05% (p=0.000 n=9+10) Expm1 101ns ± 0% 99ns ± 0% -1.88% (p=0.000 n=10+10) Exp2Go 150ns ± 0% 146ns ± 0% -2.67% (p=0.000 n=10+10) Abs 7.01ns ± 0% 6.01ns ± 0% -14.27% (p=0.000 n=10+9) Mod 234ns ± 0% 212ns ± 0% -9.40% (p=0.000 n=9+10) Frexp 34.5ns ± 0% 30.0ns ± 0% -13.04% (p=0.000 n=10+10) Gamma 112ns ± 0% 111ns ± 0% -0.89% (p=0.000 n=10+10) Hypot 73.6ns ± 0% 68.6ns ± 0% -6.79% (p=0.000 n=10+10) HypotGo 77.1ns ± 0% 72.1ns ± 0% -6.49% (p=0.000 n=10+10) Ilogb 31.0ns ± 0% 28.0ns ± 0% -9.68% (p=0.000 n=10+10) J0 437ns ± 0% 434ns ± 0% -0.62% (p=0.000 n=10+10) J1 433ns ± 0% 431ns ± 0% -0.46% (p=0.000 n=10+10) Jn 927ns ± 0% 922ns ± 0% -0.54% (p=0.000 n=10+10) Ldexp 41.5ns ± 0% 37.0ns ± 0% -10.84% (p=0.000 n=9+10) Log 124ns ± 0% 118ns ± 0% -4.84% (p=0.000 n=10+9) Logb 34.0ns ± 0% 32.0ns ± 0% -5.88% (p=0.000 n=10+10) Log1p 110ns ± 0% 108ns ± 0% -1.82% (p=0.000 n=10+10) Log10 136ns ± 0% 132ns ± 0% -2.94% (p=0.000 n=10+10) Log2 51.6ns ± 0% 47.1ns ± 0% -8.72% (p=0.000 n=10+10) Nextafter32 33.0ns ± 0% 30.5ns ± 0% -7.58% (p=0.000 n=10+10) Nextafter64 29.0ns ± 0% 26.5ns ± 0% -8.62% (p=0.000 n=10+10) PowInt 169ns ± 0% 160ns ± 0% -5.33% (p=0.000 n=10+10) PowFrac 375ns ± 0% 361ns ± 0% -3.73% (p=0.000 n=10+10) RoundToEven 14.0ns ± 0% 12.5ns ± 0% -10.71% (p=0.000 n=10+10) Remainder 206ns ± 0% 192ns ± 0% -6.80% (p=0.000 n=10+9) Signbit 6.01ns ± 0% 5.51ns ± 0% -8.32% 
(p=0.000 n=10+9) Sin 70.1ns ± 0% 69.6ns ± 0% -0.71% (p=0.000 n=10+10) Sincos 99.1ns ± 0% 99.6ns ± 0% +0.50% (p=0.000 n=9+10) SqrtGoLatency 178ns ± 0% 146ns ± 0% -17.70% (p=0.000 n=8+10) SqrtPrime 9.19µs ± 0% 9.20µs ± 0% +0.01% (p=0.000 n=9+9) Tanh 125ns ± 1% 127ns ± 0% +1.36% (p=0.000 n=10+10) Y0 428ns ± 0% 426ns ± 0% -0.47% (p=0.000 n=10+10) Y1 431ns ± 0% 429ns ± 0% -0.46% (p=0.000 n=10+9) Yn 906ns ± 0% 901ns ± 0% -0.55% (p=0.000 n=10+10) Float64bits 4.50ns ± 0% 3.50ns ± 0% -22.22% (p=0.000 n=10+10) Float64frombits 4.00ns ± 0% 3.50ns ± 0% -12.50% (p=0.000 n=10+9) Float32bits 4.50ns ± 0% 3.50ns ± 0% -22.22% (p=0.002 n=8+10) Float32frombits 4.00ns ± 0% 3.50ns ± 0% -12.50% (p=0.000 n=10+10) Change-Id: Iba829e15d5624962fe0c699139ea783efeefabc2 Reviewed-on: https://go-review.googlesource.com/129715 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
13de5e7f7f |
math/bits: add extended precision Add, Sub, Mul, Div
Port math/big pure go versions of add-with-carry, subtract-with-borrow, full-width multiply, and full-width divide. Updates #24813 Change-Id: Ifae5d2f6ee4237137c9dcba931f69c91b80a4b1c Reviewed-on: https://go-review.googlesource.com/123157 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
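As a rough illustration of the full-width operations, the sketch below multiplies two 64-bit values into a 128-bit product with `bits.Mul64` and divides it back with `bits.Div64`. The helper `mulThenDiv` is a hypothetical name used only here:

```go
package main

import (
	"fmt"
	"math/bits"
)

// mulThenDiv computes the 128-bit product x*y and divides it by y again.
// It assumes y != 0. Because x < 2^64, the high word of x*y is always
// less than y, so the quotient fits in 64 bits.
func mulThenDiv(x, y uint64) (quo, rem uint64) {
	hi, lo := bits.Mul64(x, y) // full 128-bit product
	return bits.Div64(hi, lo, y)
}

func main() {
	q, r := mulThenDiv(^uint64(0), 12345)
	fmt.Println(q == ^uint64(0), r)
}
```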
|
|
|
ded9411580 |
math: add Round and RoundToEven examples
Change-Id: Ibef5f96ea588d17eac1c96ee3992e01943ba0fef Reviewed-on: https://go-review.googlesource.com/131496 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> |
|
|
|
a3381faf81 |
math/big: streamline divLarge initialization
The divLarge code contained "todo"s about avoiding alias and clear calls in the initialization of variables. By rearranging the order of initialization and always using an auxiliary variable for the shifted divisor, all of these calls can be safely avoided.
On average, normalizing the divisor (shift>0) is required 31/32 or 63/64 of the time. If one always performs the shift into an auxiliary variable first, this avoids the need to check for aliasing of vIn in the output variables u and z. The remainder u is initialized via a left shift of uIn and thus needs no alias check against uIn. Since uIn and vIn were both used, z needs no alias checks except against u which is used for storage of the remainder.
This change has a minimal impact on performance (see below), but cleans up the initialization code and eliminates the "todo"s.
name old time/op new time/op delta
Div/20/10-4 86.7ns ± 6% 85.7ns ± 5% ~ (p=0.841 n=5+5)
Div/200/100-4 523ns ± 5% 502ns ± 3% -4.13% (p=0.024 n=5+5)
Div/2000/1000-4 2.55µs ± 3% 2.59µs ± 5% ~ (p=0.548 n=5+5)
Div/20000/10000-4 80.4µs ± 4% 80.0µs ± 2% ~ (p=1.000 n=5+5)
Div/200000/100000-4 6.43ms ± 6% 6.35ms ± 4% ~ (p=0.548 n=5+5)
Fixes #22928
Change-Id: I30d8498ef1cf8b69b0f827165c517bc25a5c32d7
Reviewed-on: https://go-review.googlesource.com/130775
Reviewed-by: Robert Griesemer <gri@golang.org> |
|
|
|
3fd62ce910 |
math/big: optimize multiplication by 2 and 1/2 in float Sqrt
The Sqrt code previously used explicit constants for 2 and 1/2. This change replaces multiplication by these constants with increment and decrement of the floating point exponent directly. This improves performance by ~7-10% for small inputs and minimal improvement for large inputs.
name old time/op new time/op delta
FloatSqrt/64-4 1.39µs ± 0% 1.29µs ± 3% -7.01% (p=0.016 n=4+5)
FloatSqrt/128-4 2.84µs ± 0% 2.60µs ± 1% -8.33% (p=0.008 n=5+5)
FloatSqrt/256-4 3.24µs ± 1% 2.91µs ± 2% -10.00% (p=0.008 n=5+5)
FloatSqrt/1000-4 7.42µs ± 1% 6.74µs ± 0% -9.16% (p=0.008 n=5+5)
FloatSqrt/10000-4 65.9µs ± 1% 65.3µs ± 4% ~ (p=0.310 n=5+5)
FloatSqrt/100000-4 1.57ms ± 8% 1.52ms ± 1% ~ (p=0.111 n=5+4)
FloatSqrt/1000000-4 127ms ± 1% 126ms ± 1% ~ (p=0.690 n=5+5)
Change-Id: Id81ac842a9d64981e001c4ca3ff129eebd227593
Reviewed-on: https://go-review.googlesource.com/130835
Reviewed-by: Robert Griesemer <gri@golang.org> |
|
|
|
1ae2eed0b2 |
math: test for pos/neg zero return of Ceil/Floor/Trunc
Ceil and Trunc of -0.2 return -0, not +0, but we didn't test that. Updates #23647 Change-Id: Idbd4699376abfb4ca93f16c73c114d610d86a9f2 Reviewed-on: https://go-review.googlesource.com/91335 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
0fbaf6ca8b |
math,net: omit explicit true tag expr in switch
Performed `switch true {}` => `switch {}` replacement.
Found using https://go-critic.github.io/overview.html#switchTrue-ref
Change-Id: Ib39ea98531651966a5a56b7bd729b46e4eeb7f7c
Reviewed-on: https://go-review.googlesource.com/123378
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
|
|
edae0ff8c1 |
math: use s390x mnemonics rather than binary encodings
TMLL, LGDR and LDGR have all been added to the Go assembler previously, so we don't need to encode them using WORD and BYTE directives anymore. This is purely a cosmetic change, it does not change the contents of any object files. Change-Id: I93f815b91be310858297d8a0dc9e6d8e3f09dd65 Reviewed-on: https://go-review.googlesource.com/129895 Run-TryBot: Michael Munday <mike.munday@ibm.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
669ac1228a |
math/rand: improve package documentation
Notify readers that interval notation is used. Fixes: #26765 Change-Id: Id02a7fcffbf41699e85631badeee083f5d4b2201 Reviewed-on: https://go-review.googlesource.com/127549 Reviewed-by: Rob Pike <r@golang.org> |
|
|
|
51ddeb9965 |
math: add tests for erf and erfc
Test large but not infinite arguments. This CL adds a test which breaks s390x. Don't submit until a fix for that is figured out. Update #26477 Change-Id: Ic86739fe3554e87d7f8e15482875c198fcf1d59c Reviewed-on: https://go-review.googlesource.com/125641 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
f04a002e5a |
math: ensure Erfc is not called with out-of-expected-range arguments on s390x
The existing implementation produces correct results with a wide range of inputs, but invalid results asymptotically. With this change we ensure correct asymptotic results on s390x. Fixes #26477 Change-Id: I760c1f8177f7cab2d7622ab9a926dfb1f8113b49 Reviewed-on: https://go-review.googlesource.com/127119 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
30b045d4d1 |
math/big: handle negative exponents in Exp
For modular exponentiation, negative exponents can be handled using the following relation. for y < 0: x**y mod m == (x**(-1))**|y| mod m First compute ModInverse(x, m) and then compute the exponentiation with the absolute value of the exponent. Non-modular exponentiation with a negative exponent still returns 1. Fixes #25865 Change-Id: I2a35986a24794b48e549c8de935ac662d217d8a0 Reviewed-on: https://go-review.googlesource.com/118562 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> |
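The relation above can be written out by hand as a sketch. The helper `expNeg` is a hypothetical name; it assumes y < 0 and m > 0 and reports false when x has no inverse modulo m:

```go
package main

import (
	"fmt"
	"math/big"
)

// expNeg computes x**y mod m for y < 0 as (x**-1)**|y| mod m.
// It returns false if x has no inverse modulo m.
func expNeg(x, y, m int64) (int64, bool) {
	mod := big.NewInt(m)
	inv := new(big.Int).ModInverse(big.NewInt(x), mod)
	if inv == nil {
		return 0, false // x and m are not relatively prime
	}
	absY := new(big.Int).Abs(big.NewInt(y))
	return new(big.Int).Exp(inv, absY, mod).Int64(), true
}

func main() {
	// 3**-1 mod 7 is 5, and 5**2 mod 7 is 4.
	fmt.Println(expNeg(3, -2, 7))
}
```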
|
|
|
d31cad7ca5 |
math/big: round x + (-x) to -0 for mode ToNegativeInf
Handling of sign bit as defined by IEEE 754-2008, section 6.3: When the sum of two operands with opposite signs (or the difference of two operands with like signs) is exactly zero, the sign of that sum (or difference) shall be +0 in all rounding-direction attributes except roundTowardNegative; under that attribute, the sign of an exact zero sum (or difference) shall be −0. However, x+x = x−(−x) retains the same sign as x even when x is zero. This change handles the special case of Add/Sub resulting in exactly zero when the rounding mode is ToNegativeInf setting the sign bit accordingly. Fixes #25798 Change-Id: I4d0715fa3c3e4a3d8a4d7861dc1d6423c8b1c68c Reviewed-on: https://go-review.googlesource.com/117495 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> |
|
|
|
efddc161d2 |
math: add examples to Ceil, Floor, Pow, Pow10 functions
Change-Id: I9154df128b349c102854bb0f21e4c313685dd0e6 Reviewed-on: https://go-review.googlesource.com/118659 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
161874da2a |
all: update comment URLs from HTTP to HTTPS, where possible
Each URL was manually verified to ensure it did not serve up incorrect content. Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df Reviewed-on: https://go-review.googlesource.com/115798 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> |
|
|
|
85f4051731 |
math/big: implement Atkin's ModSqrt for 5 mod 8 primes
For primes congruent to 5 mod 8 there is a simple deterministic method for calculating the modular square root due to Atkin, using one exponentiation and 4 multiplications. A. Atkin. Probabilistic primality testing, summary by F. Morain. Research Report 1779, INRIA, pages 159–163, 1992. This increases the speed of modular square roots for these primes considerably. name old time/op new time/op delta ModSqrt231_5Mod8-4 1.03ms ± 2% 0.36ms ± 5% -65.06% (p=0.008 n=5+5) Change-Id: I024f6e514bbca8d634218983117db2afffe615fe Reviewed-on: https://go-review.googlesource.com/99615 Reviewed-by: Robert Griesemer <gri@golang.org> |
|
|
|
555eb70db2 |
all: regenerate stringer files
Change-Id: I34838320047792c4719837591e848b87ccb7f5ab Reviewed-on: https://go-review.googlesource.com/115058 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
1a5d0f83c9 |
math/big: reduce allocations in Karatsuba case of sqr
For #23221. Change-Id: If55dcf2e0706d6658f4a0863e3740437e008706c Reviewed-on: https://go-review.googlesource.com/114335 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> |
|
|
|
1dc20e9124 |
math/big: specialize Karatsuba implementation for squaring
Currently we use three different algorithms for squaring: 1. basic multiplication for small numbers 2. basic squaring for medium numbers 3. Karatsuba multiplication for large numbers Change 3. to a version of Karatsuba multiplication specialized for x == y. Increasing the performance of 3. lets us lower the threshold between 2. and 3. Adapt TestCalibrate to the change that 3. isn't independent of the threshold between 1. and 2. any more. Fixes #23221. benchstat old.txt new.txt name old time/op new time/op delta NatSqr/1-4 29.6ns ± 7% 29.5ns ± 5% ~ (p=0.103 n=50+50) NatSqr/2-4 51.9ns ± 1% 51.9ns ± 1% ~ (p=0.693 n=42+49) NatSqr/3-4 64.3ns ± 1% 64.1ns ± 0% -0.26% (p=0.000 n=46+43) NatSqr/5-4 93.5ns ± 2% 93.1ns ± 1% -0.39% (p=0.000 n=48+49) NatSqr/8-4 131ns ± 1% 131ns ± 1% ~ (p=0.870 n=46+49) NatSqr/10-4 175ns ± 1% 175ns ± 1% +0.38% (p=0.000 n=49+47) NatSqr/20-4 426ns ± 1% 429ns ± 1% +0.84% (p=0.000 n=46+48) NatSqr/30-4 702ns ± 2% 699ns ± 1% -0.38% (p=0.011 n=46+44) NatSqr/50-4 1.44µs ± 2% 1.43µs ± 1% -0.54% (p=0.010 n=48+48) NatSqr/80-4 2.85µs ± 1% 2.87µs ± 1% +0.68% (p=0.000 n=47+47) NatSqr/100-4 4.06µs ± 1% 4.07µs ± 1% +0.29% (p=0.000 n=46+45) NatSqr/200-4 13.4µs ± 1% 13.5µs ± 1% +0.73% (p=0.000 n=48+48) NatSqr/300-4 28.5µs ± 1% 28.2µs ± 1% -1.22% (p=0.000 n=46+48) NatSqr/500-4 81.9µs ± 1% 67.0µs ± 1% -18.25% (p=0.000 n=48+48) NatSqr/800-4 161µs ± 1% 140µs ± 1% -13.29% (p=0.000 n=47+48) NatSqr/1000-4 245µs ± 1% 207µs ± 1% -15.17% (p=0.000 n=49+49) go test -v -calibrate --run TestCalibrate ... 
Calibrating threshold between basicSqr(x) and karatsubaSqr(x) Looking for a timing difference for x between 200 - 500 words by 10 step words = 200 deltaT = -980ns ( -7%) is karatsubaSqr(x) better: false words = 210 deltaT = -773ns ( -5%) is karatsubaSqr(x) better: false words = 220 deltaT = -695ns ( -4%) is karatsubaSqr(x) better: false words = 230 deltaT = -570ns ( -3%) is karatsubaSqr(x) better: false words = 240 deltaT = -458ns ( -2%) is karatsubaSqr(x) better: false words = 250 deltaT = -63ns ( 0%) is karatsubaSqr(x) better: false words = 260 deltaT = 118ns ( 0%) is karatsubaSqr(x) better: true threshold found words = 270 deltaT = 377ns ( 1%) is karatsubaSqr(x) better: true words = 280 deltaT = 765ns ( 3%) is karatsubaSqr(x) better: true words = 290 deltaT = 673ns ( 2%) is karatsubaSqr(x) better: true words = 300 deltaT = 502ns ( 1%) is karatsubaSqr(x) better: true words = 310 deltaT = 629ns ( 2%) is karatsubaSqr(x) better: true words = 320 deltaT = 1.011µs ( 3%) is karatsubaSqr(x) better: true words = 330 deltaT = 1.36µs ( 4%) is karatsubaSqr(x) better: true words = 340 deltaT = 3.001µs ( 8%) is karatsubaSqr(x) better: true words = 350 deltaT = 3.178µs ( 8%) is karatsubaSqr(x) better: true ... Change-Id: I6f13c23d94d042539ac28e77fd2618cdc37a429e Reviewed-on: https://go-review.googlesource.com/105075 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> |
|
|
|
f94b5a8105 |
math/rand: clarify documentation for Seed example
Fixes #25325 Change-Id: I101641be99a820722edb7272918e04e8d2e1646c Reviewed-on: https://go-review.googlesource.com/112775 Reviewed-by: Rob Pike <r@golang.org> |
|
|
|
50649a967c |
math/big: implement Lehmer's extended GCD algorithm
Updates #15833 The extended GCD algorithm can be implemented using Lehmer's algorithm with additional updates for the cosequences following Algorithm 10.45 from Cohen et al. "Handbook of Elliptic and Hyperelliptic Curve Cryptography" pp 192. This brings the speed of the extended GCD calculation within ~2x of the base GCD calculation. There is a slight degradation in the non-extended GCD speed for small inputs (1-2 words) due to the additional code to handle the extended updates. name old time/op new time/op delta GCD10x10/WithoutXY-4 262ns ± 1% 266ns ± 2% ~ (p=0.333 n=5+5) GCD10x10/WithXY-4 1.42µs ± 2% 0.74µs ± 3% -47.90% (p=0.008 n=5+5) GCD10x100/WithoutXY-4 520ns ± 2% 539ns ± 1% +3.81% (p=0.008 n=5+5) GCD10x100/WithXY-4 2.32µs ± 1% 1.67µs ± 0% -27.80% (p=0.008 n=5+5) GCD10x1000/WithoutXY-4 1.40µs ± 1% 1.45µs ± 2% +3.26% (p=0.016 n=4+5) GCD10x1000/WithXY-4 4.78µs ± 1% 3.43µs ± 1% -28.37% (p=0.008 n=5+5) GCD10x10000/WithoutXY-4 10.0µs ± 0% 10.2µs ± 3% +1.80% (p=0.008 n=5+5) GCD10x10000/WithXY-4 20.9µs ± 3% 17.9µs ± 1% -14.20% (p=0.008 n=5+5) GCD10x100000/WithoutXY-4 96.8µs ± 0% 96.3µs ± 1% ~ (p=0.310 n=5+5) GCD10x100000/WithXY-4 196µs ± 3% 159µs ± 2% -18.61% (p=0.008 n=5+5) GCD100x100/WithoutXY-4 2.53µs ±15% 2.34µs ± 0% -7.35% (p=0.008 n=5+5) GCD100x100/WithXY-4 19.3µs ± 0% 3.9µs ± 1% -79.58% (p=0.008 n=5+5) GCD100x1000/WithoutXY-4 4.23µs ± 0% 4.17µs ± 3% ~ (p=0.127 n=5+5) GCD100x1000/WithXY-4 22.8µs ± 1% 7.5µs ±10% -67.00% (p=0.008 n=5+5) GCD100x10000/WithoutXY-4 19.1µs ± 0% 19.0µs ± 0% ~ (p=0.095 n=5+5) GCD100x10000/WithXY-4 75.1µs ± 2% 30.5µs ± 2% -59.38% (p=0.008 n=5+5) GCD100x100000/WithoutXY-4 170µs ± 5% 167µs ± 1% ~ (p=1.000 n=5+5) GCD100x100000/WithXY-4 542µs ± 2% 267µs ± 2% -50.79% (p=0.008 n=5+5) GCD1000x1000/WithoutXY-4 28.0µs ± 0% 27.1µs ± 0% -3.29% (p=0.008 n=5+5) GCD1000x1000/WithXY-4 329µs ± 0% 42µs ± 1% -87.12% (p=0.008 n=5+5) GCD1000x10000/WithoutXY-4 47.2µs ± 0% 46.4µs ± 0% -1.65% (p=0.016 n=5+4) GCD1000x10000/WithXY-4 607µs ± 9% 123µs ± 1% 
-79.70% (p=0.008 n=5+5) GCD1000x100000/WithoutXY-4 260µs ±17% 245µs ± 0% ~ (p=0.056 n=5+5) GCD1000x100000/WithXY-4 3.64ms ± 1% 0.93ms ± 1% -74.41% (p=0.016 n=4+5) GCD10000x10000/WithoutXY-4 513µs ± 0% 507µs ± 0% -1.22% (p=0.008 n=5+5) GCD10000x10000/WithXY-4 7.44ms ± 1% 1.00ms ± 0% -86.58% (p=0.008 n=5+5) GCD10000x100000/WithoutXY-4 1.23ms ± 0% 1.23ms ± 1% ~ (p=0.056 n=5+5) GCD10000x100000/WithXY-4 37.3ms ± 0% 7.3ms ± 1% -80.45% (p=0.008 n=5+5) GCD100000x100000/WithoutXY-4 24.2ms ± 0% 24.2ms ± 0% ~ (p=0.841 n=5+5) GCD100000x100000/WithXY-4 505ms ± 1% 56ms ± 1% -88.92% (p=0.008 n=5+5) Change-Id: I25f42ab8c55033acb83cc32bb03c12c1963925e8 Reviewed-on: https://go-review.googlesource.com/78755 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
b00f72e08a |
math, math/big: add wasm architecture
This commit adds the wasm architecture to the math package. Updates #18892 Change-Id: I5cc38552a31b193d35fb81ae87600a76b8b9e9b5 Reviewed-on: https://go-review.googlesource.com/106996 Reviewed-by: Cherry Zhang <cherryyz@google.com> |
|
|
|
8c4170b2c9 |
math/bits: move tests into their own package
This makes math/bits not have any explicit imports even when compiling tests and thereby avoids import cycles when dependencies of testing want to import math/bits. Change-Id: I95eccae2f5c4310e9b18124abfa85212dfbd9daa Reviewed-on: https://go-review.googlesource.com/110479 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
6f7ec484f6 |
math/big: handle negative moduli in ModInverse
Currently, there is no check for a negative modulus in ModInverse. Negative moduli are passed internally to GCD, which returns 0 for negative arguments. Mod is symmetric with respect to negative moduli, so the calculation can be done by just negating the modulus before passing the arguments to GCD. Fixes #24949 Change-Id: Ifd1e64c9b2343f0489c04ab65504e73a623378c7 Reviewed-on: https://go-review.googlesource.com/108115 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> |
|
|
|
4d44a87243 |
math/big: return nil for nonexistent ModInverse
Currently, the behavior of z.ModInverse(g, n) is undefined when g and n are not relatively prime. In that case, no ModInverse exists which can be easily checked during the computation of the ModInverse. Because the ModInverse does not indicate whether the inverse exists, there are reimplementations of a "checked" ModInverse in crypto/rsa. This change removes the undefined behavior. If the ModInverse does not exist, the receiver z is unchanged and the return value is nil. This matches the behavior of ModSqrt for the case where the square root does not exist.
name old time/op new time/op delta
ModInverse-4 2.40µs ± 4% 2.22µs ± 0% -7.74% (p=0.016 n=5+4)
name old alloc/op new alloc/op delta
ModInverse-4 1.36kB ± 0% 1.17kB ± 0% -14.12% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
ModInverse-4 10.0 ± 0% 9.0 ± 0% -10.00% (p=0.008 n=5+5)
Fixes #24922
Change-Id: If7f9d491858450bdb00f1e317152f02493c9c8a8
Reviewed-on: https://go-review.googlesource.com/108996
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org> |
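With the nil return, callers can check for a missing inverse directly. A short sketch, using a hypothetical helper `inverse`:

```go
package main

import (
	"fmt"
	"math/big"
)

// inverse returns g**-1 mod n, or false if g and n are not relatively
// prime and no inverse exists.
func inverse(g, n int64) (int64, bool) {
	z := new(big.Int)
	if z.ModInverse(big.NewInt(g), big.NewInt(n)) == nil {
		return 0, false
	}
	return z.Int64(), true
}

func main() {
	fmt.Println(inverse(3, 11)) // 3*4 = 12 = 1 mod 11
	fmt.Println(inverse(2, 4))  // no inverse: gcd(2, 4) = 2
}
```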
|
|
|
6d5ebc7022 |
math: add a testcase for Mod and Remainder respectively
One might try to implement the Mod or Remainder function with the expression x - TRUNC(x/y + 0.5)*y, but this method is wrong, because rounding (x/y + 0.5) to form the argument of TRUNC may lose too much precision. However, the current test cases cannot detect this error. This CL adds two test cases to catch such implementations. Change-Id: I6690f5cffb21bf8ae06a314b7a45cafff8bcee13 Reviewed-on: https://go-review.googlesource.com/84275 Reviewed-by: Robert Griesemer <gri@golang.org> |
|
|
|
a2ffe3e625 |
math/rand: refactor rng.go
Made constant names more idiomatic, moved some constants to function seedrand, and found better name for _M. Change-Id: I192172f398378bef486a5bbceb6ba86af48ebcc9 Reviewed-on: https://go-review.googlesource.com/107135 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
b08a9b7ecc |
all: use new softfloat on GOARM=5
Use the new softfloat support in the compiler, originally added for softfloat on MIPS. This support is portable, so we can just use it for softfloat on ARM. In the old softfloat support on ARM, the compiler generates floating point instructions, then the assembler inserts calls to _sfloat before FP instructions. _sfloat decodes the following FP instructions and simulates them. In the new scheme, the compiler generates runtime calls to do FP operations at a higher level. It doesn't generate FP instructions, and therefore the assembler won't insert _sfloat calls, i.e. the old mechanism is automatically suppressed. The old method may still be triggered with assembly code using FP instructions. In the standard library, the only occurrence is math/sqrt_arm.s, which is rewritten to call the Go implementation instead. Some significant speedups for code using floating point: name old time/op new time/op delta BinaryTree17-4 37.1s ± 2% 37.3s ± 1% ~ (p=0.105 n=10+10) Fannkuch11-4 13.0s ± 0% 13.1s ± 0% +0.46% (p=0.000 n=10+10) FmtFprintfEmpty-4 700ns ± 4% 734ns ± 6% +4.84% (p=0.009 n=10+10) FmtFprintfString-4 1.22µs ± 3% 1.22µs ± 4% ~ (p=0.897 n=10+10) FmtFprintfInt-4 1.27µs ± 2% 1.30µs ± 1% +1.91% (p=0.001 n=10+9) FmtFprintfIntInt-4 1.83µs ± 2% 1.81µs ± 3% ~ (p=0.149 n=10+10) FmtFprintfPrefixedInt-4 1.80µs ± 3% 1.81µs ± 2% ~ (p=0.421 n=10+8) FmtFprintfFloat-4 6.89µs ± 3% 3.59µs ± 2% -47.93% (p=0.000 n=10+10) FmtManyArgs-4 6.39µs ± 1% 6.09µs ± 1% -4.61% (p=0.000 n=10+9) GobDecode-4 109ms ± 2% 81ms ± 2% -25.99% (p=0.000 n=9+10) GobEncode-4 109ms ± 2% 76ms ± 2% -29.88% (p=0.000 n=10+9) Gzip-4 3.61s ± 1% 3.59s ± 1% ~ (p=0.247 n=10+10) Gunzip-4 449ms ± 4% 450ms ± 1% ~ (p=0.230 n=10+7) HTTPClientServer-4 1.55ms ± 3% 1.53ms ± 2% ~ (p=0.400 n=9+10) JSONEncode-4 356ms ± 1% 183ms ± 1% -48.73% (p=0.000 n=10+10) JSONDecode-4 1.12s ± 2% 0.87s ± 1% -21.88% (p=0.000 n=10+10) Mandelbrot200-4 5.49s ± 1% 2.55s ± 1% -53.45% (p=0.000 n=9+10) GoParse-4 49.6ms ± 2% 47.5ms ± 1% -4.08% 
(p=0.000 n=10+9) RegexpMatchEasy0_32-4 1.13µs ± 4% 1.20µs ± 4% +6.42% (p=0.000 n=10+10) RegexpMatchEasy0_1K-4 4.41µs ± 2% 4.44µs ± 2% ~ (p=0.128 n=10+10) RegexpMatchEasy1_32-4 1.15µs ± 5% 1.20µs ± 5% +4.85% (p=0.002 n=10+10) RegexpMatchEasy1_1K-4 6.21µs ± 2% 6.37µs ± 4% +2.62% (p=0.001 n=9+10) RegexpMatchMedium_32-4 1.58µs ± 5% 1.65µs ± 3% +4.85% (p=0.000 n=10+10) RegexpMatchMedium_1K-4 341µs ± 3% 351µs ± 7% ~ (p=0.573 n=8+10) RegexpMatchHard_32-4 21.4µs ± 3% 21.5µs ± 5% ~ (p=0.931 n=9+9) RegexpMatchHard_1K-4 626µs ± 2% 626µs ± 1% ~ (p=0.645 n=8+8) Revcomp-4 46.4ms ± 2% 47.4ms ± 2% +2.07% (p=0.000 n=10+10) Template-4 1.31s ± 3% 1.23s ± 4% -6.13% (p=0.000 n=10+10) TimeParse-4 4.49µs ± 1% 4.41µs ± 2% -1.81% (p=0.000 n=10+9) TimeFormat-4 9.31µs ± 1% 9.32µs ± 2% ~ (p=0.561 n=9+9) Change-Id: Iaeeff6c9a09c1b2c064d06e09dd88101dc02bfa4 Reviewed-on: https://go-review.googlesource.com/106735 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
7818b82fc8 |
math/big: clean up z.div(z, x, y) calls
Updates #22830 Due to not checking if the output slices alias in divLarge, calls of the form z.div(z, x, y) caused the slice z to attempt to be used to store both the quotient and the remainder of the division. CL 78995 applies an alias check to correct that error. This CL cleans up the additional div calls that attempt to supply the same slice to hold both the quotient and remainder. Note that the call in expNN was responsible for the reported error in r.Exp(x, 1, m) when r was initialized to a non-zero value. The second instance in expNNMontgomery did not result in an error due to the size of the arguments. // RR = 2**(2*_W*len(m)) mod m RR := nat(nil).setWord(1) zz := nat(nil).shl(RR, uint(2*numWords*_W)) _, RR = RR.div(RR, zz, m) Specifically, cap(RR) == 5 after setWord(1) due to const e = 4 in z.make(1) len(zz) == 2*len(m) + 1 after shifting left, numWords = len(m) Reusing the backing array for z and z2 in div was only triggered if cap(RR) >= len(zz) + 1 and len(m) > 1 so that divLarge was called. But, 5 < 2*len(m) + 2 if len(m) > 1, so new arrays were allocated and the error was never triggered in this case. Change-Id: Iedac80dbbde13216c94659e84d28f6f4be3aaf24 Reviewed-on: https://go-review.googlesource.com/81055 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> |
|
|
|
fc8967e384 |
math/big: improve performance on ppc64x by unrolling loops
This change improves performance of addVV, subVV and mulAddVWW by unrolling the loops, with improvements up to 1.45x. benchmark old ns/op new ns/op delta BenchmarkAddVV/1-16 5.79 5.85 +1.04% BenchmarkAddVV/2-16 6.41 6.62 +3.28% BenchmarkAddVV/3-16 6.89 7.35 +6.68% BenchmarkAddVV/4-16 7.47 8.26 +10.58% BenchmarkAddVV/5-16 8.04 8.18 +1.74% BenchmarkAddVV/10-16 10.9 11.2 +2.75% BenchmarkAddVV/100-16 81.7 57.0 -30.23% BenchmarkAddVV/1000-16 714 500 -29.97% BenchmarkAddVV/10000-16 7088 4946 -30.22% BenchmarkAddVV/100000-16 71514 49364 -30.97% BenchmarkSubVV/1-16 5.94 5.89 -0.84% BenchmarkSubVV/2-16 12.9 6.82 -47.13% BenchmarkSubVV/3-16 7.03 7.34 +4.41% BenchmarkSubVV/4-16 7.58 8.23 +8.58% BenchmarkSubVV/5-16 8.15 8.19 +0.49% BenchmarkSubVV/10-16 11.2 11.4 +1.79% BenchmarkSubVV/100-16 82.4 57.0 -30.83% BenchmarkSubVV/1000-16 715 499 -30.21% BenchmarkSubVV/10000-16 7089 4947 -30.22% BenchmarkSubVV/100000-16 71568 49378 -31.01% benchmark old MB/s new MB/s speedup BenchmarkAddVV/1-16 11048.49 10939.92 0.99x BenchmarkAddVV/2-16 19973.41 19323.60 0.97x BenchmarkAddVV/3-16 27847.09 26123.06 0.94x BenchmarkAddVV/4-16 34276.46 30976.54 0.90x BenchmarkAddVV/5-16 39781.92 39140.68 0.98x BenchmarkAddVV/10-16 58559.29 56894.68 0.97x BenchmarkAddVV/100-16 78354.88 112243.69 1.43x BenchmarkAddVV/1000-16 89592.74 127889.04 1.43x BenchmarkAddVV/10000-16 90292.39 129387.06 1.43x BenchmarkAddVV/100000-16 89492.92 129647.78 1.45x BenchmarkSubVV/1-16 10781.03 10861.22 1.01x BenchmarkSubVV/2-16 9949.27 18760.21 1.89x BenchmarkSubVV/3-16 27319.40 26166.01 0.96x BenchmarkSubVV/4-16 33764.35 31123.02 0.92x BenchmarkSubVV/5-16 39272.40 39050.31 0.99x BenchmarkSubVV/10-16 57262.87 56206.33 0.98x BenchmarkSubVV/100-16 77641.78 112280.86 1.45x BenchmarkSubVV/1000-16 89486.27 128064.08 1.43x BenchmarkSubVV/10000-16 90274.37 129356.59 1.43x BenchmarkSubVV/100000-16 89424.42 129610.50 1.45x Change-Id: I2795a82134d1e3b75e2634c76b8ca165a723ec7b Reviewed-on: https://go-review.googlesource.com/103495 
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> |
|
|
|
542ea5ad91 |
go/printer, gofmt: tuned table alignment for better results
The go/printer (and thus gofmt) uses a heuristic to determine whether to break alignment between elements of an expression list which is spread across multiple lines. The heuristic only kicked in if the entry sizes (character length) were above a certain threshold (20) and the ratio between the previous and current entry size was above a certain value (4). This heuristic worked reasonably most of the time, but also led to unfortunate breaks in many cases where a single entry was suddenly much smaller (or larger) than the previous one. The behavior of gofmt was sufficiently mysterious in some of these situations that many issues were filed against it. The simplest solution to address this problem is to remove the heuristic altogether and have a programmer introduce empty lines to force different alignments if it improves readability. The problem with that approach is that the places where it really matters, very long tables with many (hundreds, or more) entries, may be machine-generated and not "post-processed" by a human (e.g., unicode/utf8/tables.go). If a single one of those entries is overlong, the result would be that the alignment would force all comments or values in key:value pairs to be adjusted to that overlong value, making the table hard to read (e.g., that entry may not even be visible on screen and all other entries seem spaced out too wide). Instead, we opted for a slightly improved heuristic that behaves much better for "normal", human-written code. 1) The threshold is increased from 20 to 40. This disables the heuristic for many common cases yet even if the alignment is not "ideal", 40 is not that many characters per line with today's screens, making it very likely that the entire line remains "visible" in an editor. 2) Changed the heuristic to not simply look at the size ratio between current and previous line, but instead considering the geometric mean of the sizes of the previous (aligned) lines. 
This emphasizes the "overall picture" of the previous lines, rather than a single one (which might be an outlier). 3) Changed the ratio from 4 to 2.5. Now that we ignore sizes below 40, a ratio of 4 would mean that a new entry would have to be 4 times bigger (160) or smaller (10) before alignment would be broken. A ratio of 2.5 seems more sensible. Applied updated gofmt to all of src and misc. Also tested against several former issues that complained about this and verified that the output for the given examples is satisfactory (added respective test cases). Some of the files changed because they were not gofmt-ed in the first place. For #644. For #7335. For #10392. (and probably more related issues) Fixes #22852. Change-Id: I5e48b3d3b157a5cf2d649833b7297b33f43a6f6e |
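The three adjustments can be sketched as follows. This is a simplified illustration written from the description above, not the go/printer source: the function `breaksAlignment`, its signature, and the exact way the threshold is applied (skipping the check when both the previous and the current entry are at or below 40) are assumptions.

```go
package main

import (
	"fmt"
	"math"
)

const (
	smallSize = 40  // threshold from the commit message (raised from 20)
	maxRatio  = 2.5 // ratio from the commit message (lowered from 4)
)

// breaksAlignment is a hypothetical helper. It reports whether an entry of
// the given size should break alignment with the previous aligned entries.
// All sizes are assumed to be positive character counts.
func breaksAlignment(prevSizes []int, size int) bool {
	if len(prevSizes) == 0 || size <= 0 {
		return false
	}
	// Assumption: small neighbouring entries never trigger the heuristic.
	if prevSizes[len(prevSizes)-1] <= smallSize && size <= smallSize {
		return false
	}
	// Geometric mean of the previous sizes, computed via logarithms.
	lnsum := 0.0
	for _, s := range prevSizes {
		lnsum += math.Log(float64(s))
	}
	geomean := math.Exp(lnsum / float64(len(prevSizes)))

	// Break when the new entry is much larger or much smaller than the mean.
	ratio := float64(size) / geomean
	return ratio >= maxRatio || ratio*maxRatio <= 1
}

func main() {
	prev := []int{50, 50, 50}
	fmt.Println(breaksAlignment(prev, 60))  // close to the mean
	fmt.Println(breaksAlignment(prev, 200)) // four times the mean
}
```

Comparing against the geometric mean rather than the immediately preceding entry means a single outlier among the previous lines has less influence on the decision.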
|
|
|
02952ad7a8 |
math/big: remove "else" from if with block that ends with return
That "else" was needed due to gc DCE limitations. That is no longer the case, so we can avoid go lint complaints. (See #23521 and https://golang.org/cl/91056.) There is an inlining test for bigEndianWord, so if the test passes, no performance regression should occur. Change-Id: Id84d63f361e5e51a52293904ff042966c83c16e9 Reviewed-on: https://go-review.googlesource.com/104555 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> |
|
|
|
32e6461dc6 |
cmd/asm, math: add s390x floating point test instructions
Floating point test instructions allow special cases (NaN, ±∞ and a few other useful properties) to be checked directly. This CL adds the following instructions to the assembler: * LTEBR - load and test (float32) * LTDBR - load and test (float64) * TCEB - test data class (float32) * TCDB - test data class (float64) Note that I have only added immediate versions of the 'test data class' instructions for now as that's the only case I think the compiler will use. Change-Id: I3398aab2b3a758bf909bd158042234030c8af582 Reviewed-on: https://go-review.googlesource.com/104457 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
4b265fb747 |
math: fix Ldexp when result is below ldexp(2, -1075)
Before this change, the smallest result Ldexp could handle was ldexp(2, -1075), which is SmallestNonzeroFloat64. Some numbers below it should also be rounded to SmallestNonzeroFloat64. This change fixes that. Fixes #23407 Change-Id: I76f4cb005a6e9ccdd95b5e5c734079fd5d29e4aa Reviewed-on: https://go-review.googlesource.com/87338 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> |
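A short sketch of the rounding behavior this fix is about, assuming a Go release that includes the change. The helper `ldexpIsSmallest` and the chosen inputs are illustrative and are not taken from the CL's tests.

```go
package main

import (
	"fmt"
	"math"
)

// ldexpIsSmallest is a hypothetical helper. It reports whether
// math.Ldexp(frac, exp) equals math.SmallestNonzeroFloat64 (2**-1074).
func ldexpIsSmallest(frac float64, exp int) bool {
	return math.Ldexp(frac, exp) == math.SmallestNonzeroFloat64
}

func main() {
	// 2 * 2**-1075 is exactly 2**-1074.
	fmt.Println(ldexpIsSmallest(2, -1075))

	// 1.5 * 2**-1075 lies below 2**-1074 but is nearer to it than to zero,
	// so it should round up to SmallestNonzeroFloat64.
	fmt.Println(ldexpIsSmallest(1.5, -1075))

	// 1 * 2**-1075 is an exact tie between 0 and 2**-1074.
	fmt.Println(math.Ldexp(1, -1075))
}
```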
|
|
|
7fe2f549cc |
math: handle denormals in AMD64 Exp
Fixes #23164 Change-Id: I6e8c6443f3ef91df71e117cce1cfa1faba647dd7 Reviewed-on: https://go-review.googlesource.com/87337 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> |
|
|
|
711a373cc3 |
math: optimize Exp and Exp2 on arm64
This CL implements Exp and Exp2 with arm64 assembly. By inlining Ldexp and using fused instructions (fmadd, fmsub, fnmsub), this CL helps to improve the performance of functions Exp, Exp2, Sinh, Cosh and Tanh. Benchmarks: name old time/op new time/op delta Cosh-8 138ns ± 0% 96ns ± 0% -30.72% (p=0.008 n=5+5) Exp-8 105ns ± 0% 58ns ± 0% -45.24% (p=0.000 n=5+4) Exp2-8 100ns ± 0% 57ns ± 0% -43.21% (p=0.008 n=5+5) Sinh-8 139ns ± 0% 102ns ± 0% -26.62% (p=0.008 n=5+5) Tanh-8 134ns ± 0% 100ns ± 0% -25.67% (p=0.008 n=5+5) Change-Id: I7483a3333062a1d3525cedf3de56db78d79031c6 Reviewed-on: https://go-review.googlesource.com/86615 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> |