mirror/go - go - Git Fam. Sieh

Commit Graph

Author	SHA1	Message	Date
Brad Fitzpatrick	3813edf26e	all: use "reports whether" consistently in the few places that didn't Go documentation style for boolean funcs is to say: // Foo reports whether ... func Foo() bool (rather than "returns true if") This CL also replaces 4 uses of "iff" with the same "reports whether" wording, which doesn't lose any meaning, and will prevent people from sending typo fixes when they don't realize it's "if and only if". In the past I think we've had the typo CLs updated to just say "reports whether". So do them all at once. (Inspired by the addition of another "returns true if" in CL 146938 in fd_plan9.go) Created with: $ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" \| grep -v vendor) $ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" \| grep -v vendor) Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f Reviewed-on: https://go-review.googlesource.com/c/147037 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-11-02 22:47:58 +00:00
Robert Griesemer	c86d464734	math/big: shallow copies of Int/Rat/Float are not supported (documentation) Fixes #28423. Change-Id: Ie57ade565d0407a4bffaa86fb4475ff083168e79 Reviewed-on: https://go-review.googlesource.com/c/145537 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-10-29 18:23:31 +00:00
hearot	f28191340e	math/big: fix a formula used as documentation The function documentation was wrong, it was using a wrong parameter. This change replaces it with the right parameter. The wrong formula was: q = (u1<<_W + u0 - r)/y The function has got a parameter "v" (of type Word), not a parameter "y". So, the right formula is: q = (u1<<_W + u0 - r)/v Fixes #28444 Change-Id: I82e57ba014735a9fdb6262874ddf498754d30d33 Reviewed-on: https://go-review.googlesource.com/c/145280 Reviewed-by: Robert Griesemer <gri@golang.org>	2018-10-28 16:58:20 +00:00
Keith Randall	899f3a2892	cmd/compile: intrinsify math/bits.Add on amd64 name old time/op new time/op delta Add-8 1.11ns ± 0% 1.18ns ± 0% +6.31% (p=0.029 n=4+4) Add32-8 1.02ns ± 0% 1.02ns ± 1% ~ (p=0.333 n=4+5) Add64-8 1.11ns ± 1% 1.17ns ± 0% +5.79% (p=0.008 n=5+5) Add64multiple-8 4.35ns ± 1% 0.86ns ± 0% -80.22% (p=0.000 n=5+4) The individual ops are a bit slower (but still very fast). Using the ops in carry chains is very fast. Update #28273 Change-Id: Id975f76df2b930abf0e412911d327b6c5b1befe5 Reviewed-on: https://go-review.googlesource.com/c/144257 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-10-25 19:47:00 +00:00
Brian Kessler	127c51e48c	math/bits: correct BenchmarkSub64 Previously, the benchmark was measuring Add64 instead of Sub64. Change-Id: I0cf30935c8a4728bead9868834377aae0b34f008 Reviewed-on: https://go-review.googlesource.com/c/144380 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-10-24 14:53:19 +00:00
Igor Zhilianin	f90e89e675	all: fix a bunch of misspellings Change-Id: If2954bdfc551515403706b2cd0dde94e45936e08 GitHub-Last-Rev: `d4cfc41a55` GitHub-Pull-Request: golang/go#28049 Reviewed-on: https://go-review.googlesource.com/c/140299 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-10-06 15:40:03 +00:00
Plekhanov Maxim	47e71f3b69	math: use Abs in Pow rather than if x < 0 { x = -x } name old time/op new time/op delta PowInt 55.7ns ± 1% 53.4ns ± 2% -4.15% (p=0.000 n=9+9) PowFrac 133ns ± 1% 133ns ± 2% ~ (p=0.587 n=8+9) Change-Id: Ica0f4c2cbd554f2195c6d1762ed26742ff8e3924 Reviewed-on: https://go-review.googlesource.com/c/85375 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-10-04 17:33:04 +00:00
Plekhanov Maxim	497d24178f	math: use Abs in Mod rather than if x < 0 { x = -x} goos: linux goarch: amd64 pkg: math name old time/op new time/op delta Mod 64.7ns ± 2% 63.7ns ± 2% -1.52% (p=0.003 n=8+10) Change-Id: I851bec0fd6c223dab73e4a680b7393d49e81a0e8 Reviewed-on: https://go-review.googlesource.com/c/85095 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-10-04 17:32:44 +00:00
Zhou Peng	b8ac64a581	all: this big patch remove whitespace from assembly files Don't worry, this patch just remove trailing whitespace from assembly files, and does not touch any logical changes. Change-Id: Ia724ac0b1abf8bc1e41454bdc79289ef317c165d Reviewed-on: https://go-review.googlesource.com/c/113595 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-10-03 15:28:51 +00:00
fanzha02	a19a83c8ef	cmd/compile: optimize math.Float64(32)bits and math.Float64(32)frombits on arm64 Use float <-> int register moves without conversion instead of stores and loads to move float <-> int values. Math package benchmark results. name old time/op new time/op delta Acosh 153ns ± 0% 147ns ± 0% -3.92% (p=0.000 n=10+10) Asinh 183ns ± 0% 177ns ± 0% -3.28% (p=0.000 n=10+10) Atanh 157ns ± 0% 155ns ± 0% -1.27% (p=0.000 n=10+10) Atan2 118ns ± 0% 117ns ± 1% -0.59% (p=0.003 n=10+10) Cbrt 119ns ± 0% 114ns ± 0% -4.20% (p=0.000 n=10+10) Copysign 7.51ns ± 0% 6.51ns ± 0% -13.32% (p=0.000 n=9+10) Cos 73.1ns ± 0% 70.6ns ± 0% -3.42% (p=0.000 n=10+10) Cosh 119ns ± 0% 121ns ± 0% +1.68% (p=0.000 n=10+9) ExpGo 154ns ± 0% 149ns ± 0% -3.05% (p=0.000 n=9+10) Expm1 101ns ± 0% 99ns ± 0% -1.88% (p=0.000 n=10+10) Exp2Go 150ns ± 0% 146ns ± 0% -2.67% (p=0.000 n=10+10) Abs 7.01ns ± 0% 6.01ns ± 0% -14.27% (p=0.000 n=10+9) Mod 234ns ± 0% 212ns ± 0% -9.40% (p=0.000 n=9+10) Frexp 34.5ns ± 0% 30.0ns ± 0% -13.04% (p=0.000 n=10+10) Gamma 112ns ± 0% 111ns ± 0% -0.89% (p=0.000 n=10+10) Hypot 73.6ns ± 0% 68.6ns ± 0% -6.79% (p=0.000 n=10+10) HypotGo 77.1ns ± 0% 72.1ns ± 0% -6.49% (p=0.000 n=10+10) Ilogb 31.0ns ± 0% 28.0ns ± 0% -9.68% (p=0.000 n=10+10) J0 437ns ± 0% 434ns ± 0% -0.62% (p=0.000 n=10+10) J1 433ns ± 0% 431ns ± 0% -0.46% (p=0.000 n=10+10) Jn 927ns ± 0% 922ns ± 0% -0.54% (p=0.000 n=10+10) Ldexp 41.5ns ± 0% 37.0ns ± 0% -10.84% (p=0.000 n=9+10) Log 124ns ± 0% 118ns ± 0% -4.84% (p=0.000 n=10+9) Logb 34.0ns ± 0% 32.0ns ± 0% -5.88% (p=0.000 n=10+10) Log1p 110ns ± 0% 108ns ± 0% -1.82% (p=0.000 n=10+10) Log10 136ns ± 0% 132ns ± 0% -2.94% (p=0.000 n=10+10) Log2 51.6ns ± 0% 47.1ns ± 0% -8.72% (p=0.000 n=10+10) Nextafter32 33.0ns ± 0% 30.5ns ± 0% -7.58% (p=0.000 n=10+10) Nextafter64 29.0ns ± 0% 26.5ns ± 0% -8.62% (p=0.000 n=10+10) PowInt 169ns ± 0% 160ns ± 0% -5.33% (p=0.000 n=10+10) PowFrac 375ns ± 0% 361ns ± 0% -3.73% (p=0.000 n=10+10) RoundToEven 14.0ns ± 0% 12.5ns ± 0% -10.71% (p=0.000 n=10+10) Remainder 206ns ± 0% 192ns ± 0% -6.80% (p=0.000 n=10+9) Signbit 6.01ns ± 0% 5.51ns ± 0% -8.32% (p=0.000 n=10+9) Sin 70.1ns ± 0% 69.6ns ± 0% -0.71% (p=0.000 n=10+10) Sincos 99.1ns ± 0% 99.6ns ± 0% +0.50% (p=0.000 n=9+10) SqrtGoLatency 178ns ± 0% 146ns ± 0% -17.70% (p=0.000 n=8+10) SqrtPrime 9.19µs ± 0% 9.20µs ± 0% +0.01% (p=0.000 n=9+9) Tanh 125ns ± 1% 127ns ± 0% +1.36% (p=0.000 n=10+10) Y0 428ns ± 0% 426ns ± 0% -0.47% (p=0.000 n=10+10) Y1 431ns ± 0% 429ns ± 0% -0.46% (p=0.000 n=10+9) Yn 906ns ± 0% 901ns ± 0% -0.55% (p=0.000 n=10+10) Float64bits 4.50ns ± 0% 3.50ns ± 0% -22.22% (p=0.000 n=10+10) Float64frombits 4.00ns ± 0% 3.50ns ± 0% -12.50% (p=0.000 n=10+9) Float32bits 4.50ns ± 0% 3.50ns ± 0% -22.22% (p=0.002 n=8+10) Float32frombits 4.00ns ± 0% 3.50ns ± 0% -12.50% (p=0.000 n=10+10) Change-Id: Iba829e15d5624962fe0c699139ea783efeefabc2 Reviewed-on: https://go-review.googlesource.com/129715 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-09-17 20:49:04 +00:00
Brian Kessler	13de5e7f7f	math/bits: add extended precision Add, Sub, Mul, Div Port math/big pure go versions of add-with-carry, subtract-with-borrow, full-width multiply, and full-width divide. Updates #24813 Change-Id: Ifae5d2f6ee4237137c9dcba931f69c91b80a4b1c Reviewed-on: https://go-review.googlesource.com/123157 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-09-11 17:54:20 +00:00
Eric Ponce	ded9411580	math: add Round and RoundToEven examples Change-Id: Ibef5f96ea588d17eac1c96ee3992e01943ba0fef Reviewed-on: https://go-review.googlesource.com/131496 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-08-28 05:22:41 +00:00
Brian Kessler	a3381faf81	math/big: streamline divLarge initialization The divLarge code contained "todo"s about avoiding alias and clear calls in the initialization of variables. By rearranging the order of initialization and always using an auxiliary variable for the shifted divisor, all of these calls can be safely avoided. On average, normalizing the divisor (shift>0) is required 31/32 or 63/64 of the time. If one always performs the shift into an auxiliary variable first, this avoids the need to check for aliasing of vIn in the output variables u and z. The remainder u is initialized via a left shift of uIn and thus needs no alias check against uIn. Since uIn and vIn were both used, z needs no alias checks except against u which is used for storage of the remainder. This change has a minimal impact on performance (see below), but cleans up the initialization code and eliminates the "todo"s. name old time/op new time/op delta Div/20/10-4 86.7ns ± 6% 85.7ns ± 5% ~ (p=0.841 n=5+5) Div/200/100-4 523ns ± 5% 502ns ± 3% -4.13% (p=0.024 n=5+5) Div/2000/1000-4 2.55µs ± 3% 2.59µs ± 5% ~ (p=0.548 n=5+5) Div/20000/10000-4 80.4µs ± 4% 80.0µs ± 2% ~ (p=1.000 n=5+5) Div/200000/100000-4 6.43ms ± 6% 6.35ms ± 4% ~ (p=0.548 n=5+5) Fixes #22928 Change-Id: I30d8498ef1cf8b69b0f827165c517bc25a5c32d7 Reviewed-on: https://go-review.googlesource.com/130775 Reviewed-by: Robert Griesemer <gri@golang.org>	2018-08-22 22:54:01 +00:00
Brian Kessler	3fd62ce910	math/big: optimize multiplication by 2 and 1/2 in float Sqrt The Sqrt code previously used explicit constants for 2 and 1/2. This change replaces multiplication by these constants with increment and decrement of the floating point exponent directly. This improves performance by ~7-10% for small inputs and minimal improvement for large inputs. name old time/op new time/op delta FloatSqrt/64-4 1.39µs ± 0% 1.29µs ± 3% -7.01% (p=0.016 n=4+5) FloatSqrt/128-4 2.84µs ± 0% 2.60µs ± 1% -8.33% (p=0.008 n=5+5) FloatSqrt/256-4 3.24µs ± 1% 2.91µs ± 2% -10.00% (p=0.008 n=5+5) FloatSqrt/1000-4 7.42µs ± 1% 6.74µs ± 0% -9.16% (p=0.008 n=5+5) FloatSqrt/10000-4 65.9µs ± 1% 65.3µs ± 4% ~ (p=0.310 n=5+5) FloatSqrt/100000-4 1.57ms ± 8% 1.52ms ± 1% ~ (p=0.111 n=5+4) FloatSqrt/1000000-4 127ms ± 1% 126ms ± 1% ~ (p=0.690 n=5+5) Change-Id: Id81ac842a9d64981e001c4ca3ff129eebd227593 Reviewed-on: https://go-review.googlesource.com/130835 Reviewed-by: Robert Griesemer <gri@golang.org>	2018-08-22 21:02:21 +00:00
Ian Lance Taylor	1ae2eed0b2	math: test for pos/neg zero return of Ceil/Floor/Trunc Ceil and Trunc of -0.2 return -0, not +0, but we didn't test that. Updates #23647 Change-Id: Idbd4699376abfb4ca93f16c73c114d610d86a9f2 Reviewed-on: https://go-review.googlesource.com/91335 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-08-21 22:31:37 +00:00
Iskander Sharipov	0fbaf6ca8b	math,net: omit explicit true tag expr in switch Performed `switch true {}` => `switch {}` replacement. Found using https://go-critic.github.io/overview.html#switchTrue-ref Change-Id: Ib39ea98531651966a5a56b7bd729b46e4eeb7f7c Reviewed-on: https://go-review.googlesource.com/123378 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-08-20 22:15:59 +00:00
Michael Munday	edae0ff8c1	math: use s390x mnemonics rather than binary encodings TMLL, LGDR and LDGR have all been added to the Go assembler previously, so we don't need to encode them using WORD and BYTE directives anymore. This is purely a cosmetic change, it does not change the contents of any object files. Change-Id: I93f815b91be310858297d8a0dc9e6d8e3f09dd65 Reviewed-on: https://go-review.googlesource.com/129895 Run-TryBot: Michael Munday <mike.munday@ibm.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-08-20 17:42:08 +00:00
Benjamin Cable	669ac1228a	math/rand: improve package documentation Notify readers that interval notation is used. Fixes: #26765 Change-Id: Id02a7fcffbf41699e85631badeee083f5d4b2201 Reviewed-on: https://go-review.googlesource.com/127549 Reviewed-by: Rob Pike <r@golang.org>	2018-08-03 23:08:42 +00:00
Keith Randall	51ddeb9965	math: add tests for erf and erfc Test large but not infinite arguments. This CL adds a test which breaks s390x. Don't submit until a fix for that is figured out. Update #26477 Change-Id: Ic86739fe3554e87d7f8e15482875c198fcf1d59c Reviewed-on: https://go-review.googlesource.com/125641 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-08-03 03:38:52 +00:00
bill_ofarrell	f04a002e5a	math: ensure Erfc is not called with out-of-expected-range arguments on s390x The existing implementation produces correct results with a wide range of inputs, but invalid results asymptotically. With this change we ensure correct asymptotic results on s390x Fixes #26477 Change-Id: I760c1f8177f7cab2d7622ab9a926dfb1f8113b49 Reviewed-on: https://go-review.googlesource.com/127119 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-08-03 01:21:31 +00:00
Brian Kessler	30b045d4d1	math/big: handle negative exponents in Exp For modular exponentiation, negative exponents can be handled using the following relation. for y < 0: xy mod m == (x(-1))**\|y\| mod m First compute ModInverse(x, m) and then compute the exponentiation with the absolute value of the exponent. Non-modular exponentiation with a negative exponent still returns 1. Fixes #25865 Change-Id: I2a35986a24794b48e549c8de935ac662d217d8a0 Reviewed-on: https://go-review.googlesource.com/118562 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-06-14 22:26:30 +00:00
Brian Kessler	d31cad7ca5	math/big: round x + (-x) to -0 for mode ToNegativeInf Handling of sign bit as defined by IEEE 754-2008, section 6.3: When the sum of two operands with opposite signs (or the difference of two operands with like signs) is exactly zero, the sign of that sum (or difference) shall be +0 in all rounding-direction attributes except roundTowardNegative; under that attribute, the sign of an exact zero sum (or difference) shall be −0. However, x+x = x−(−x) retains the same sign as x even when x is zero. This change handles the special case of Add/Sub resulting in exactly zero when the rounding mode is ToNegativeInf setting the sign bit accordingly. Fixes #25798 Change-Id: I4d0715fa3c3e4a3d8a4d7861dc1d6423c8b1c68c Reviewed-on: https://go-review.googlesource.com/117495 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-06-14 21:07:01 +00:00
Andrii Soldatenko	efddc161d2	math: add examples to Ceil, Floor, Pow, Pow10 functions Change-Id: I9154df128b349c102854bb0f21e4c313685dd0e6 Reviewed-on: https://go-review.googlesource.com/118659 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-06-13 22:01:28 +00:00
Tim Cooper	161874da2a	all: update comment URLs from HTTP to HTTPS, where possible Each URL was manually verified to ensure it did not serve up incorrect content. Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df Reviewed-on: https://go-review.googlesource.com/115798 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2018-06-01 21:52:00 +00:00
Brian Kessler	85f4051731	math/big: implement Atkin's ModSqrt for 5 mod 8 primes For primes congruent to 5 mod 8 there is a simple deterministic method for calculating the modular square root due to Atkin, using one exponentiation and 4 multiplications. A. Atkin. Probabilistic primality testing, summary by F. Morain. Research Report 1779, INRIA, pages 159–163, 1992. This increases the speed of modular square roots for these primes considerably. name old time/op new time/op delta ModSqrt231_5Mod8-4 1.03ms ± 2% 0.36ms ± 5% -65.06% (p=0.008 n=5+5) Change-Id: I024f6e514bbca8d634218983117db2afffe615fe Reviewed-on: https://go-review.googlesource.com/99615 Reviewed-by: Robert Griesemer <gri@golang.org>	2018-05-30 16:28:46 +00:00
Tim Cooper	555eb70db2	all: regenerate stringer files Change-Id: I34838320047792c4719837591e848b87ccb7f5ab Reviewed-on: https://go-review.googlesource.com/115058 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-05-29 20:35:41 +00:00
Alexander Döring	1a5d0f83c9	math/big: reduce allocations in Karatsuba case of sqr For #23221. Change-Id: If55dcf2e0706d6658f4a0863e3740437e008706c Reviewed-on: https://go-review.googlesource.com/114335 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-05-23 23:01:10 +00:00
Alexander Döring	1dc20e9124	math/big: specialize Karatsuba implementation for squaring Currently we use three different algorithms for squaring: 1. basic multiplication for small numbers 2. basic squaring for medium numbers 3. Karatsuba multiplication for large numbers Change 3. to a version of Karatsuba multiplication specialized for x == y. Increasing the performance of 3. lets us lower the threshold between 2. and 3. Adapt TestCalibrate to the change that 3. isn't independent of the threshold between 1. and 2. any more. Fixes #23221. benchstat old.txt new.txt name old time/op new time/op delta NatSqr/1-4 29.6ns ± 7% 29.5ns ± 5% ~ (p=0.103 n=50+50) NatSqr/2-4 51.9ns ± 1% 51.9ns ± 1% ~ (p=0.693 n=42+49) NatSqr/3-4 64.3ns ± 1% 64.1ns ± 0% -0.26% (p=0.000 n=46+43) NatSqr/5-4 93.5ns ± 2% 93.1ns ± 1% -0.39% (p=0.000 n=48+49) NatSqr/8-4 131ns ± 1% 131ns ± 1% ~ (p=0.870 n=46+49) NatSqr/10-4 175ns ± 1% 175ns ± 1% +0.38% (p=0.000 n=49+47) NatSqr/20-4 426ns ± 1% 429ns ± 1% +0.84% (p=0.000 n=46+48) NatSqr/30-4 702ns ± 2% 699ns ± 1% -0.38% (p=0.011 n=46+44) NatSqr/50-4 1.44µs ± 2% 1.43µs ± 1% -0.54% (p=0.010 n=48+48) NatSqr/80-4 2.85µs ± 1% 2.87µs ± 1% +0.68% (p=0.000 n=47+47) NatSqr/100-4 4.06µs ± 1% 4.07µs ± 1% +0.29% (p=0.000 n=46+45) NatSqr/200-4 13.4µs ± 1% 13.5µs ± 1% +0.73% (p=0.000 n=48+48) NatSqr/300-4 28.5µs ± 1% 28.2µs ± 1% -1.22% (p=0.000 n=46+48) NatSqr/500-4 81.9µs ± 1% 67.0µs ± 1% -18.25% (p=0.000 n=48+48) NatSqr/800-4 161µs ± 1% 140µs ± 1% -13.29% (p=0.000 n=47+48) NatSqr/1000-4 245µs ± 1% 207µs ± 1% -15.17% (p=0.000 n=49+49) go test -v -calibrate --run TestCalibrate ... Calibrating threshold between basicSqr(x) and karatsubaSqr(x) Looking for a timing difference for x between 200 - 500 words by 10 step words = 200 deltaT = -980ns ( -7%) is karatsubaSqr(x) better: false words = 210 deltaT = -773ns ( -5%) is karatsubaSqr(x) better: false words = 220 deltaT = -695ns ( -4%) is karatsubaSqr(x) better: false words = 230 deltaT = -570ns ( -3%) is karatsubaSqr(x) better: false words = 240 deltaT = -458ns ( -2%) is karatsubaSqr(x) better: false words = 250 deltaT = -63ns ( 0%) is karatsubaSqr(x) better: false words = 260 deltaT = 118ns ( 0%) is karatsubaSqr(x) better: true threshold found words = 270 deltaT = 377ns ( 1%) is karatsubaSqr(x) better: true words = 280 deltaT = 765ns ( 3%) is karatsubaSqr(x) better: true words = 290 deltaT = 673ns ( 2%) is karatsubaSqr(x) better: true words = 300 deltaT = 502ns ( 1%) is karatsubaSqr(x) better: true words = 310 deltaT = 629ns ( 2%) is karatsubaSqr(x) better: true words = 320 deltaT = 1.011µs ( 3%) is karatsubaSqr(x) better: true words = 330 deltaT = 1.36µs ( 4%) is karatsubaSqr(x) better: true words = 340 deltaT = 3.001µs ( 8%) is karatsubaSqr(x) better: true words = 350 deltaT = 3.178µs ( 8%) is karatsubaSqr(x) better: true ... Change-Id: I6f13c23d94d042539ac28e77fd2618cdc37a429e Reviewed-on: https://go-review.googlesource.com/105075 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-05-23 20:46:02 +00:00
Agniva De Sarker	f94b5a8105	math/rand: clarify documentation for Seed example Fixes #25325 Change-Id: I101641be99a820722edb7272918e04e8d2e1646c Reviewed-on: https://go-review.googlesource.com/112775 Reviewed-by: Rob Pike <r@golang.org>	2018-05-12 06:21:01 +00:00
Brian Kessler	50649a967c	math/big: implement Lehmer's extended GCD algorithm Updates #15833 The extended GCD algorithm can be implemented using Lehmer's algorithm with additional updates for the cosequences following Algorithm 10.45 from Cohen et al. "Handbook of Elliptic and Hyperelliptic Curve Cryptography" pp 192. This brings the speed of the extended GCD calculation within ~2x of the base GCD calculation. There is a slight degradation in the non-extended GCD speed for small inputs (1-2 words) due to the additional code to handle the extended updates. name old time/op new time/op delta GCD10x10/WithoutXY-4 262ns ± 1% 266ns ± 2% ~ (p=0.333 n=5+5) GCD10x10/WithXY-4 1.42µs ± 2% 0.74µs ± 3% -47.90% (p=0.008 n=5+5) GCD10x100/WithoutXY-4 520ns ± 2% 539ns ± 1% +3.81% (p=0.008 n=5+5) GCD10x100/WithXY-4 2.32µs ± 1% 1.67µs ± 0% -27.80% (p=0.008 n=5+5) GCD10x1000/WithoutXY-4 1.40µs ± 1% 1.45µs ± 2% +3.26% (p=0.016 n=4+5) GCD10x1000/WithXY-4 4.78µs ± 1% 3.43µs ± 1% -28.37% (p=0.008 n=5+5) GCD10x10000/WithoutXY-4 10.0µs ± 0% 10.2µs ± 3% +1.80% (p=0.008 n=5+5) GCD10x10000/WithXY-4 20.9µs ± 3% 17.9µs ± 1% -14.20% (p=0.008 n=5+5) GCD10x100000/WithoutXY-4 96.8µs ± 0% 96.3µs ± 1% ~ (p=0.310 n=5+5) GCD10x100000/WithXY-4 196µs ± 3% 159µs ± 2% -18.61% (p=0.008 n=5+5) GCD100x100/WithoutXY-4 2.53µs ±15% 2.34µs ± 0% -7.35% (p=0.008 n=5+5) GCD100x100/WithXY-4 19.3µs ± 0% 3.9µs ± 1% -79.58% (p=0.008 n=5+5) GCD100x1000/WithoutXY-4 4.23µs ± 0% 4.17µs ± 3% ~ (p=0.127 n=5+5) GCD100x1000/WithXY-4 22.8µs ± 1% 7.5µs ±10% -67.00% (p=0.008 n=5+5) GCD100x10000/WithoutXY-4 19.1µs ± 0% 19.0µs ± 0% ~ (p=0.095 n=5+5) GCD100x10000/WithXY-4 75.1µs ± 2% 30.5µs ± 2% -59.38% (p=0.008 n=5+5) GCD100x100000/WithoutXY-4 170µs ± 5% 167µs ± 1% ~ (p=1.000 n=5+5) GCD100x100000/WithXY-4 542µs ± 2% 267µs ± 2% -50.79% (p=0.008 n=5+5) GCD1000x1000/WithoutXY-4 28.0µs ± 0% 27.1µs ± 0% -3.29% (p=0.008 n=5+5) GCD1000x1000/WithXY-4 329µs ± 0% 42µs ± 1% -87.12% (p=0.008 n=5+5) GCD1000x10000/WithoutXY-4 47.2µs ± 0% 46.4µs ± 0% -1.65% (p=0.016 n=5+4) GCD1000x10000/WithXY-4 607µs ± 9% 123µs ± 1% -79.70% (p=0.008 n=5+5) GCD1000x100000/WithoutXY-4 260µs ±17% 245µs ± 0% ~ (p=0.056 n=5+5) GCD1000x100000/WithXY-4 3.64ms ± 1% 0.93ms ± 1% -74.41% (p=0.016 n=4+5) GCD10000x10000/WithoutXY-4 513µs ± 0% 507µs ± 0% -1.22% (p=0.008 n=5+5) GCD10000x10000/WithXY-4 7.44ms ± 1% 1.00ms ± 0% -86.58% (p=0.008 n=5+5) GCD10000x100000/WithoutXY-4 1.23ms ± 0% 1.23ms ± 1% ~ (p=0.056 n=5+5) GCD10000x100000/WithXY-4 37.3ms ± 0% 7.3ms ± 1% -80.45% (p=0.008 n=5+5) GCD100000x100000/WithoutXY-4 24.2ms ± 0% 24.2ms ± 0% ~ (p=0.841 n=5+5) GCD100000x100000/WithXY-4 505ms ± 1% 56ms ± 1% -88.92% (p=0.008 n=5+5) Change-Id: I25f42ab8c55033acb83cc32bb03c12c1963925e8 Reviewed-on: https://go-review.googlesource.com/78755 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-05-08 17:24:36 +00:00
Richard Musiol	b00f72e08a	math, math/big: add wasm architecture This commit adds the wasm architecture to the math package. Updates #18892 Change-Id: I5cc38552a31b193d35fb81ae87600a76b8b9e9b5 Reviewed-on: https://go-review.googlesource.com/106996 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-05-08 13:29:22 +00:00
Martin Möhrmann	8c4170b2c9	math/bits: move tests into their own package This makes math/bits not have any explicit imports even when compiling tests and thereby avoids import cycles when dependencies of testing want to import math/bits. Change-Id: I95eccae2f5c4310e9b18124abfa85212dfbd9daa Reviewed-on: https://go-review.googlesource.com/110479 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-05-01 15:33:01 +00:00
Brian Kessler	6f7ec484f6	math/big: handle negative moduli in ModInverse Currently, there is no check for a negative modulus in ModInverse. Negative moduli are passed internally to GCD, which returns 0 for negative arguments. Mod is symmetric with respect to negative moduli, so the calculation can be done by just negating the modulus before passing the arguments to GCD. Fixes #24949 Change-Id: Ifd1e64c9b2343f0489c04ab65504e73a623378c7 Reviewed-on: https://go-review.googlesource.com/108115 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org>	2018-05-01 00:04:28 +00:00
Brian Kessler	4d44a87243	math/big: return nil for nonexistent ModInverse Currently, the behavior of z.ModInverse(g, n) is undefined when g and n are not relatively prime. In that case, no ModInverse exists which can be easily checked during the computation of the ModInverse. Because the ModInverse does not indicate whether the inverse exists, there are reimplementations of a "checked" ModInverse in crypto/rsa. This change removes the undefined behavior. If the ModInverse does not exist, the receiver z is unchanged and the return value is nil. This matches the behavior of ModSqrt for the case where the square root does not exist. name old time/op new time/op delta ModInverse-4 2.40µs ± 4% 2.22µs ± 0% -7.74% (p=0.016 n=5+4) name old alloc/op new alloc/op delta ModInverse-4 1.36kB ± 0% 1.17kB ± 0% -14.12% (p=0.008 n=5+5) name old allocs/op new allocs/op delta ModInverse-4 10.0 ± 0% 9.0 ± 0% -10.00% (p=0.008 n=5+5) Fixes #24922 Change-Id: If7f9d491858450bdb00f1e317152f02493c9c8a8 Reviewed-on: https://go-review.googlesource.com/108996 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-04-30 23:45:27 +00:00
erifan01	6d5ebc7022	math: add a testcase for Mod and Remainder respectively One might try to implement the Mod or Remainder function with the expression x - TRUNC(x/y + 0.5)*y, but in fact this method is wrong, because the rounding of (x/y + 0.5) to initialize the argument of TRUNC may lose too much precision. However, the current test cases can not detect this error. This CL adds two test cases to prevent people from continuing to do such attempts. Change-Id: I6690f5cffb21bf8ae06a314b7a45cafff8bcee13 Reviewed-on: https://go-review.googlesource.com/84275 Reviewed-by: Robert Griesemer <gri@golang.org>	2018-04-17 03:17:22 +00:00
weeellz	a2ffe3e625	math/rand: refactor rng.go Made constant names more idiomatic, moved some constants to function seedrand, and found better name for _M. Change-Id: I192172f398378bef486a5bbceb6ba86af48ebcc9 Reviewed-on: https://go-review.googlesource.com/107135 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-16 22:22:14 +00:00
Cherry Zhang	b08a9b7ecc	all: use new softfloat on GOARM=5 Use the new softfloat support in the compiler, originally added for softfloat on MIPS. This support is portable, so we can just use it for softfloat on ARM. In the old softfloat support on ARM, the compiler generates floating point instructions, then the assembler inserts calls to _sfloat before FP instructions. _sfloat decodes the following FP instructions and simulates them. In the new scheme, the compiler generates runtime calls to do FP operations at a higher level. It doesn't generate FP instructions, and therefore the assembler won't insert _sfloat calls, i.e. the old mechanism is automatically suppressed. The old method may be still be triggered with assembly code using FP instructions. In the standard library, the only occurance is math/sqrt_arm.s, which is rewritten to call to the Go implementation instead. Some significant speedups for code using floating points: name old time/op new time/op delta BinaryTree17-4 37.1s ± 2% 37.3s ± 1% ~ (p=0.105 n=10+10) Fannkuch11-4 13.0s ± 0% 13.1s ± 0% +0.46% (p=0.000 n=10+10) FmtFprintfEmpty-4 700ns ± 4% 734ns ± 6% +4.84% (p=0.009 n=10+10) FmtFprintfString-4 1.22µs ± 3% 1.22µs ± 4% ~ (p=0.897 n=10+10) FmtFprintfInt-4 1.27µs ± 2% 1.30µs ± 1% +1.91% (p=0.001 n=10+9) FmtFprintfIntInt-4 1.83µs ± 2% 1.81µs ± 3% ~ (p=0.149 n=10+10) FmtFprintfPrefixedInt-4 1.80µs ± 3% 1.81µs ± 2% ~ (p=0.421 n=10+8) FmtFprintfFloat-4 6.89µs ± 3% 3.59µs ± 2% -47.93% (p=0.000 n=10+10) FmtManyArgs-4 6.39µs ± 1% 6.09µs ± 1% -4.61% (p=0.000 n=10+9) GobDecode-4 109ms ± 2% 81ms ± 2% -25.99% (p=0.000 n=9+10) GobEncode-4 109ms ± 2% 76ms ± 2% -29.88% (p=0.000 n=10+9) Gzip-4 3.61s ± 1% 3.59s ± 1% ~ (p=0.247 n=10+10) Gunzip-4 449ms ± 4% 450ms ± 1% ~ (p=0.230 n=10+7) HTTPClientServer-4 1.55ms ± 3% 1.53ms ± 2% ~ (p=0.400 n=9+10) JSONEncode-4 356ms ± 1% 183ms ± 1% -48.73% (p=0.000 n=10+10) JSONDecode-4 1.12s ± 2% 0.87s ± 1% -21.88% (p=0.000 n=10+10) Mandelbrot200-4 5.49s ± 1% 2.55s ± 1% -53.45% (p=0.000 n=9+10) GoParse-4 49.6ms ± 2% 47.5ms ± 1% -4.08% (p=0.000 n=10+9) RegexpMatchEasy0_32-4 1.13µs ± 4% 1.20µs ± 4% +6.42% (p=0.000 n=10+10) RegexpMatchEasy0_1K-4 4.41µs ± 2% 4.44µs ± 2% ~ (p=0.128 n=10+10) RegexpMatchEasy1_32-4 1.15µs ± 5% 1.20µs ± 5% +4.85% (p=0.002 n=10+10) RegexpMatchEasy1_1K-4 6.21µs ± 2% 6.37µs ± 4% +2.62% (p=0.001 n=9+10) RegexpMatchMedium_32-4 1.58µs ± 5% 1.65µs ± 3% +4.85% (p=0.000 n=10+10) RegexpMatchMedium_1K-4 341µs ± 3% 351µs ± 7% ~ (p=0.573 n=8+10) RegexpMatchHard_32-4 21.4µs ± 3% 21.5µs ± 5% ~ (p=0.931 n=9+9) RegexpMatchHard_1K-4 626µs ± 2% 626µs ± 1% ~ (p=0.645 n=8+8) Revcomp-4 46.4ms ± 2% 47.4ms ± 2% +2.07% (p=0.000 n=10+10) Template-4 1.31s ± 3% 1.23s ± 4% -6.13% (p=0.000 n=10+10) TimeParse-4 4.49µs ± 1% 4.41µs ± 2% -1.81% (p=0.000 n=10+9) TimeFormat-4 9.31µs ± 1% 9.32µs ± 2% ~ (p=0.561 n=9+9) Change-Id: Iaeeff6c9a09c1b2c064d06e09dd88101dc02bfa4 Reviewed-on: https://go-review.googlesource.com/106735 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-13 16:39:39 +00:00
Brian Kessler	7818b82fc8	math/big: clean up z.div(z, x, y) calls Updates #22830 Due to not checking if the output slices alias in divLarge, calls of the form z.div(z, x, y) caused the slice z to attempt to be used to store both the quotient and the remainder of the division. CL 78995 applies an alias check to correct that error. This CL cleans up the additional div calls that attempt to supply the same slice to hold both the quotient and remainder. Note that the call in expNN was responsible for the reported error in r.Exp(x, 1, m) when r was initialized to a non-zero value. The second instance in expNNMontgomery did not result in an error due to the size of the arguments. // RR = 2*(2_Wlen(m)) mod m RR := nat(nil).setWord(1) zz := nat(nil).shl(RR, uint(2numWords_W)) _, RR = RR.div(RR, zz, m) Specifically, cap(RR) == 5 after setWord(1) due to const e = 4 in z.make(1) len(zz) == 2len(m) + 1 after shifting left, numWords = len(m) Reusing the backing array for z and z2 in div was only triggered if cap(RR) >= len(zz) + 1 and len(m) > 1 so that divLarge was called. But, 5 < 2*len(m) + 2 if len(m) > 1, so new arrays were allocated and the error was never triggered in this case. Change-Id: Iedac80dbbde13216c94659e84d28f6f4be3aaf24 Reviewed-on: https://go-review.googlesource.com/81055 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-04-05 22:02:33 +00:00
Carlos Eduardo Seo	fc8967e384	math/big: improve performance on ppc64x by unrolling loops This change improves performance of addVV, subVV and mulAddVWW by unrolling the loops, with improvements up to 1.45x. benchmark old ns/op new ns/op delta BenchmarkAddVV/1-16 5.79 5.85 +1.04% BenchmarkAddVV/2-16 6.41 6.62 +3.28% BenchmarkAddVV/3-16 6.89 7.35 +6.68% BenchmarkAddVV/4-16 7.47 8.26 +10.58% BenchmarkAddVV/5-16 8.04 8.18 +1.74% BenchmarkAddVV/10-16 10.9 11.2 +2.75% BenchmarkAddVV/100-16 81.7 57.0 -30.23% BenchmarkAddVV/1000-16 714 500 -29.97% BenchmarkAddVV/10000-16 7088 4946 -30.22% BenchmarkAddVV/100000-16 71514 49364 -30.97% BenchmarkSubVV/1-16 5.94 5.89 -0.84% BenchmarkSubVV/2-16 12.9 6.82 -47.13% BenchmarkSubVV/3-16 7.03 7.34 +4.41% BenchmarkSubVV/4-16 7.58 8.23 +8.58% BenchmarkSubVV/5-16 8.15 8.19 +0.49% BenchmarkSubVV/10-16 11.2 11.4 +1.79% BenchmarkSubVV/100-16 82.4 57.0 -30.83% BenchmarkSubVV/1000-16 715 499 -30.21% BenchmarkSubVV/10000-16 7089 4947 -30.22% BenchmarkSubVV/100000-16 71568 49378 -31.01% benchmark old MB/s new MB/s speedup BenchmarkAddVV/1-16 11048.49 10939.92 0.99x BenchmarkAddVV/2-16 19973.41 19323.60 0.97x BenchmarkAddVV/3-16 27847.09 26123.06 0.94x BenchmarkAddVV/4-16 34276.46 30976.54 0.90x BenchmarkAddVV/5-16 39781.92 39140.68 0.98x BenchmarkAddVV/10-16 58559.29 56894.68 0.97x BenchmarkAddVV/100-16 78354.88 112243.69 1.43x BenchmarkAddVV/1000-16 89592.74 127889.04 1.43x BenchmarkAddVV/10000-16 90292.39 129387.06 1.43x BenchmarkAddVV/100000-16 89492.92 129647.78 1.45x BenchmarkSubVV/1-16 10781.03 10861.22 1.01x BenchmarkSubVV/2-16 9949.27 18760.21 1.89x BenchmarkSubVV/3-16 27319.40 26166.01 0.96x BenchmarkSubVV/4-16 33764.35 31123.02 0.92x BenchmarkSubVV/5-16 39272.40 39050.31 0.99x BenchmarkSubVV/10-16 57262.87 56206.33 0.98x BenchmarkSubVV/100-16 77641.78 112280.86 1.45x BenchmarkSubVV/1000-16 89486.27 128064.08 1.43x BenchmarkSubVV/10000-16 90274.37 129356.59 1.43x BenchmarkSubVV/100000-16 89424.42 129610.50 1.45x Change-Id: I2795a82134d1e3b75e2634c76b8ca165a723ec7b Reviewed-on: https://go-review.googlesource.com/103495 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>	2018-04-05 20:00:30 +00:00
Robert Griesemer	542ea5ad91	go/printer, gofmt: tuned table alignment for better results The go/printer (and thus gofmt) uses a heuristic to determine whether to break alignment between elements of an expression list which is spread across multiple lines. The heuristic only kicked in if the entry sizes (character length) was above a certain threshold (20) and the ratio between the previous and current entry size was above a certain value (4). This heuristic worked reasonably most of the time, but also led to unfortunate breaks in many cases where a single entry was suddenly much smaller (or larger) then the previous one. The behavior of gofmt was sufficiently mysterious in some of these situations that many issues were filed against it. The simplest solution to address this problem is to remove the heuristic altogether and have a programmer introduce empty lines to force different alignments if it improves readability. The problem with that approach is that the places where it really matters, very long tables with many (hundreds, or more) entries, may be machine-generated and not "post-processed" by a human (e.g., unicode/utf8/tables.go). If a single one of those entries is overlong, the result would be that the alignment would force all comments or values in key:value pairs to be adjusted to that overlong value, making the table hard to read (e.g., that entry may not even be visible on screen and all other entries seem spaced out too wide). Instead, we opted for a slightly improved heuristic that behaves much better for "normal", human-written code. 1) The threshold is increased from 20 to 40. This disables the heuristic for many common cases yet even if the alignment is not "ideal", 40 is not that many characters per line with todays screens, making it very likely that the entire line remains "visible" in an editor. 2) Changed the heuristic to not simply look at the size ratio between current and previous line, but instead considering the geometric mean of the sizes of the previous (aligned) lines. This emphasizes the "overall picture" of the previous lines, rather than a single one (which might be an outlier). 3) Changed the ratio from 4 to 2.5. Now that we ignore sizes below 40, a ratio of 4 would mean that a new entry would have to be 4 times bigger (160) or smaller (10) before alignment would be broken. A ratio of 2.5 seems more sensible. Applied updated gofmt to all of src and misc. Also tested against several former issues that complained about this and verified that the output for the given examples is satisfactory (added respective test cases). Some of the files changed because they were not gofmt-ed in the first place. For #644. For #7335. For #10392. (and probably more related issues) Fixes #22852. Change-Id: I5e48b3d3b157a5cf2d649833b7297b33f43a6f6e	2018-04-04 13:39:34 -07:00
isharipo	02952ad7a8	math/big: remove "else" from if with block that ends with return That "else" was needed due to gc DCE limitations. Now it's not the case and we can avoid go lint complaints. (See #23521 and https://golang.org/cl/91056.) There is inlining test for bigEndianWord, so if test is passing, no performance regression should occur. Change-Id: Id84d63f361e5e51a52293904ff042966c83c16e9 Reviewed-on: https://go-review.googlesource.com/104555 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2018-04-03 19:55:04 +00:00
Michael Munday	32e6461dc6	cmd/asm, math: add s390x floating point test instructions Floating point test instructions allow special cases (NaN, ±∞ and a few other useful properties) to be checked directly. This CL adds the following instructions to the assembler: * LTEBR - load and test (float32) * LTDBR - load and test (float64) * TCEB - test data class (float32) * TCDB - test data class (float64) Note that I have only added immediate versions of the 'test data class' instructions for now as that's the only case I think the compiler will use. Change-Id: I3398aab2b3a758bf909bd158042234030c8af582 Reviewed-on: https://go-review.googlesource.com/104457 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-03 16:08:04 +00:00
Wèi Cōngruì	4b265fb747	math: fix Ldexp when result is below ldexp(2, -1075) Before this change, the smallest result Ldexp can handle was ldexp(2, -1075), which is SmallestNonzeroFloat64. There are some numbers below it should also be rounded to SmallestNonzeroFloat64. The change fixes this. Fixes #23407 Change-Id: I76f4cb005a6e9ccdd95b5e5c734079fd5d29e4aa Reviewed-on: https://go-review.googlesource.com/87338 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-29 23:14:13 +00:00
Wèi Cōngruì	7fe2f549cc	math: handle denormals in AMD64 Exp Fixes #23164 Change-Id: I6e8c6443f3ef91df71e117cce1cfa1faba647dd7 Reviewed-on: https://go-review.googlesource.com/87337 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-29 07:03:09 +00:00
erifan01	711a373cc3	math: optimize Exp and Exp2 on arm64 This CL implements Exp and Exp2 with arm64 assembly. By inlining Ldexp and using fused instructions(fmadd, fmsub, fnmsub), this CL helps to improve the performance of functions Exp, Exp2, Sinh, Cosh and Tanh. Benchmarks: name old time/op new time/op delta Cosh-8 138ns ± 0% 96ns ± 0% -30.72% (p=0.008 n=5+5) Exp-8 105ns ± 0% 58ns ± 0% -45.24% (p=0.000 n=5+4) Exp2-8 100ns ± 0% 57ns ± 0% -43.21% (p=0.008 n=5+5) Sinh-8 139ns ± 0% 102ns ± 0% -26.62% (p=0.008 n=5+5) Tanh-8 134ns ± 0% 100ns ± 0% -25.67% (p=0.008 n=5+5) Change-Id: I7483a3333062a1d3525cedf3de56db78d79031c6 Reviewed-on: https://go-review.googlesource.com/86615 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-03-27 19:55:02 +00:00
Carlos Eduardo Seo	a44c72823c	math/big: improve performance of addVW/subVW for ppc64x This change adds a better implementation in asm for addVW/subVW for ppc64x, with speedups up to 3.11x. benchmark old ns/op new ns/op delta BenchmarkAddVW/1-16 6.87 5.71 -16.89% BenchmarkAddVW/2-16 7.72 5.94 -23.06% BenchmarkAddVW/3-16 8.74 6.56 -24.94% BenchmarkAddVW/4-16 9.66 7.26 -24.84% BenchmarkAddVW/5-16 10.8 7.26 -32.78% BenchmarkAddVW/10-16 17.4 9.97 -42.70% BenchmarkAddVW/100-16 164 56.0 -65.85% BenchmarkAddVW/1000-16 1638 524 -68.01% BenchmarkAddVW/10000-16 16421 5201 -68.33% BenchmarkAddVW/100000-16 165762 53324 -67.83% BenchmarkSubVW/1-16 6.76 5.62 -16.86% BenchmarkSubVW/2-16 7.69 6.02 -21.72% BenchmarkSubVW/3-16 8.85 6.61 -25.31% BenchmarkSubVW/4-16 10.0 7.34 -26.60% BenchmarkSubVW/5-16 11.3 7.33 -35.13% BenchmarkSubVW/10-16 19.5 18.7 -4.10% BenchmarkSubVW/100-16 153 55.9 -63.46% BenchmarkSubVW/1000-16 1502 519 -65.45% BenchmarkSubVW/10000-16 15005 5165 -65.58% BenchmarkSubVW/100000-16 150620 53124 -64.73% benchmark old MB/s new MB/s speedup BenchmarkAddVW/1-16 1165.12 1400.76 1.20x BenchmarkAddVW/2-16 2071.39 2693.25 1.30x BenchmarkAddVW/3-16 2744.72 3656.92 1.33x BenchmarkAddVW/4-16 3311.63 4407.34 1.33x BenchmarkAddVW/5-16 3700.52 5512.48 1.49x BenchmarkAddVW/10-16 4605.63 8026.37 1.74x BenchmarkAddVW/100-16 4856.15 14296.76 2.94x BenchmarkAddVW/1000-16 4883.96 15264.21 3.13x BenchmarkAddVW/10000-16 4871.52 15380.78 3.16x BenchmarkAddVW/100000-16 4826.17 15002.48 3.11x BenchmarkSubVW/1-16 1183.20 1423.03 1.20x BenchmarkSubVW/2-16 2081.92 2657.44 1.28x BenchmarkSubVW/3-16 2711.52 3632.30 1.34x BenchmarkSubVW/4-16 3198.30 4360.30 1.36x BenchmarkSubVW/5-16 3534.43 5460.40 1.54x BenchmarkSubVW/10-16 4106.34 4273.51 1.04x BenchmarkSubVW/100-16 5213.48 14306.32 2.74x BenchmarkSubVW/1000-16 5324.27 15391.21 2.89x BenchmarkSubVW/10000-16 5331.33 15486.57 2.90x BenchmarkSubVW/100000-16 5311.35 15059.01 2.84x Change-Id: Ibaa5b9b38d63fba8e01a9c327eb8bef1e6e908c1 Reviewed-on: https://go-review.googlesource.com/101975 Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>	2018-03-27 15:06:53 +00:00
Vlad Krasnov	26d74e8b65	math/big: reduce amount of copying in Montgomery multiplication Instead shifting the accumulator every iteration of the loop, shift once in the end. This significantly improves performance on arm64. On arm64: name old time/op new time/op delta RSA2048Decrypt 3.33ms ± 0% 2.63ms ± 0% -20.94% (p=0.000 n=11+11) RSA2048Sign 4.22ms ± 0% 3.55ms ± 0% -15.89% (p=0.000 n=11+11) 3PrimeRSA2048Decrypt 1.95ms ± 0% 1.59ms ± 0% -18.59% (p=0.000 n=11+11) On Skylake: name old time/op new time/op delta RSA2048Decrypt-8 1.73ms ± 2% 1.55ms ± 2% -10.19% (p=0.000 n=10+10) RSA2048Sign-8 2.17ms ± 2% 2.00ms ± 2% -7.93% (p=0.000 n=10+10) 3PrimeRSA2048Decrypt-8 1.10ms ± 2% 0.96ms ± 2% -13.03% (p=0.000 n=10+9) Change-Id: I5786191a1a09e4217fdb1acfd90880d35c5855f7 Reviewed-on: https://go-review.googlesource.com/99838 Run-TryBot: Vlad Krasnov <vlad@cloudflare.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Adam Langley <agl@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-19 21:40:56 +00:00
Alberto Donizetti	168cc7ff9c	math/big: add 0 shift fastpath to shl and shr One could expect calls like z.mant.shl(z.mant, shiftAmount) (or higher-level-functions calls that use lhs/rhs) to be almost free when shiftAmount = 0; and expect calls like z.mant.shl(x.mant, 0) to have the same cost of a x.mant -> z.mant copy. Neither of this things are currently true. For an 800 words nat, the first kind of calls cost ~800ns for rigth shifts and ~3.5µs for left shift; while the second kind of calls are doing more work than necessary by calling shlVU/shrVU. This change makes the first kind of calls ({Shl,Shr}Same) almost free, and the second kind of calls ({Shl,Shr}) about 30% faster. name old time/op new time/op delta ZeroShifts/Shl-4 3.64µs ± 3% 2.49µs ± 1% -31.55% (p=0.000 n=10+10) ZeroShifts/ShlSame-4 3.65µs ± 1% 0.01µs ± 1% -99.85% (p=0.000 n=9+9) ZeroShifts/Shr-4 3.65µs ± 1% 2.49µs ± 1% -31.91% (p=0.000 n=10+10) ZeroShifts/ShrSame-4 825ns ± 0% 6ns ± 1% -99.33% (p=0.000 n=9+10) During go test math/big, the shl zeroshift fastpath is triggered 1380 times; while the shr fastpath is triggered 153334 times(!). Change-Id: I5f92b304a40638bd8453a86c87c58e54b337bcdf Reviewed-on: https://go-review.googlesource.com/87660 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2018-03-19 18:01:37 +00:00
Robert Griesemer	7d4d2cb686	math/big: add comment about internal assumptions on nat values Change-Id: I7ed40507a019c0bf521ba748fc22c03d74bb17b7 Reviewed-on: https://go-review.googlesource.com/100719 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-03-14 18:31:05 +00:00
erifan01	140bfe9cfe	math/big: optimize shlVU and shrVU on arm64 This CL implements shlVU and shrVU with arm64 HW instructions "LDP" and "STP" to reduce load cost, it also removes unnecessary checks on the number of shifts for better performance. Benchmarks: name old time/op new time/op delta AddVV/1-8 21.6ns ± 1% 21.6ns ± 1% ~ (p=0.683 n=5+5) AddVV/2-8 13.5ns ± 0% 13.5ns ± 0% ~ (all equal) AddVV/3-8 15.5ns ± 0% 15.5ns ± 0% ~ (all equal) AddVV/4-8 17.5ns ± 0% 17.5ns ± 0% ~ (all equal) AddVV/5-8 19.5ns ± 0% 19.5ns ± 0% ~ (all equal) AddVV/10-8 29.5ns ± 0% 29.5ns ± 0% ~ (all equal) AddVV/100-8 217ns ± 0% 217ns ± 0% ~ (all equal) AddVV/1000-8 2.02µs ± 0% 2.03µs ± 0% +0.73% (p=0.008 n=5+5) AddVV/10000-8 20.5µs ± 0% 20.5µs ± 0% -0.01% (p=0.008 n=5+5) AddVV/100000-8 246µs ± 5% 250µs ± 4% ~ (p=0.548 n=5+5) AddVW/1-8 9.26ns ± 0% 9.32ns ± 0% +0.65% (p=0.016 n=4+5) AddVW/2-8 19.8ns ± 3% 19.8ns ± 0% ~ (p=0.143 n=5+5) AddVW/3-8 11.5ns ± 0% 11.5ns ± 0% ~ (all equal) AddVW/4-8 13.0ns ± 0% 13.0ns ± 0% ~ (all equal) AddVW/5-8 14.5ns ± 0% 14.5ns ± 0% ~ (all equal) AddVW/10-8 22.0ns ± 0% 22.0ns ± 0% ~ (all equal) AddVW/100-8 167ns ± 0% 166ns ± 0% -0.60% (p=0.000 n=5+4) AddVW/1000-8 1.52µs ± 0% 1.52µs ± 0% ~ (all equal) AddVW/10000-8 15.1µs ± 0% 15.1µs ± 0% +0.01% (p=0.008 n=5+5) AddVW/100000-8 163µs ± 4% 153µs ± 3% -5.97% (p=0.016 n=5+5) AddMulVVW/1-8 32.4ns ± 1% 33.0ns ± 1% +1.73% (p=0.040 n=5+5) AddMulVVW/2-8 56.4ns ± 2% 55.9ns ± 1% ~ (p=0.135 n=5+5) AddMulVVW/3-8 85.4ns ± 1% 85.1ns ± 0% ~ (p=0.079 n=5+5) AddMulVVW/4-8 129ns ± 1% 129ns ± 0% ~ (p=0.397 n=5+5) AddMulVVW/5-8 148ns ± 0% 148ns ± 0% ~ (all equal) AddMulVVW/10-8 270ns ± 0% 268ns ± 0% -0.74% (p=0.029 n=4+4) AddMulVVW/100-8 2.75µs ± 0% 2.75µs ± 0% -0.09% (p=0.008 n=5+5) AddMulVVW/1000-8 26.0µs ± 0% 26.0µs ± 0% -0.06% (p=0.024 n=5+5) AddMulVVW/10000-8 312µs ± 0% 312µs ± 0% -0.09% (p=0.008 n=5+5) AddMulVVW/100000-8 2.89ms ± 0% 2.89ms ± 0% +0.14% (p=0.016 n=5+5) DecimalConversion-8 315µs ± 1% 312µs ± 0% ~ (p=0.095 n=5+5) FloatString/100-8 2.56µs ± 1% 2.52µs ± 1% -1.31% (p=0.016 n=5+5) FloatString/1000-8 58.6µs ± 0% 58.2µs ± 0% -0.75% (p=0.008 n=5+5) FloatString/10000-8 4.59ms ± 0% 4.59ms ± 0% ~ (p=0.056 n=5+5) FloatString/100000-8 446ms ± 0% 446ms ± 0% -0.04% (p=0.008 n=5+5) FloatAdd/10-8 184ns ± 0% 178ns ± 0% -3.48% (p=0.008 n=5+5) FloatAdd/100-8 189ns ± 3% 178ns ± 2% -6.02% (p=0.008 n=5+5) FloatAdd/1000-8 371ns ± 0% 267ns ± 0% -27.99% (p=0.000 n=5+4) FloatAdd/10000-8 1.87µs ± 0% 1.03µs ± 0% -44.74% (p=0.008 n=5+5) FloatAdd/100000-8 17.1µs ± 0% 8.8µs ± 0% -48.71% (p=0.016 n=5+4) FloatSub/10-8 148ns ± 0% 146ns ± 0% -1.35% (p=0.000 n=5+4) FloatSub/100-8 148ns ± 0% 140ns ± 0% -5.41% (p=0.008 n=5+5) FloatSub/1000-8 242ns ± 0% 191ns ± 0% -21.24% (p=0.008 n=5+5) FloatSub/10000-8 1.07µs ± 0% 0.64µs ± 1% -39.89% (p=0.016 n=4+5) FloatSub/100000-8 9.48µs ± 0% 5.32µs ± 0% -43.87% (p=0.008 n=5+5) ParseFloatSmallExp-8 29.3µs ± 3% 28.6µs ± 1% ~ (p=0.310 n=5+5) ParseFloatLargeExp-8 125µs ± 1% 123µs ± 0% -1.99% (p=0.008 n=5+5) GCD10x10/WithoutXY-8 278ns ± 4% 289ns ± 5% +3.96% (p=0.040 n=5+5) GCD10x10/WithXY-8 2.12µs ± 2% 2.15µs ± 2% ~ (p=0.095 n=5+5) GCD10x100/WithoutXY-8 615ns ± 1% 629ns ± 5% ~ (p=0.135 n=5+5) GCD10x100/WithXY-8 3.42µs ± 1% 3.53µs ± 2% +3.38% (p=0.008 n=5+5) GCD10x1000/WithoutXY-8 1.39µs ± 1% 1.38µs ± 1% ~ (p=0.460 n=5+5) GCD10x1000/WithXY-8 7.47µs ± 2% 7.49µs ± 3% ~ (p=1.000 n=5+5) GCD10x10000/WithoutXY-8 8.71µs ± 1% 8.71µs ± 0% ~ (p=0.841 n=5+5) GCD10x10000/WithXY-8 28.4µs ± 2% 27.2µs ± 2% -4.24% (p=0.008 n=5+5) GCD10x100000/WithoutXY-8 78.9µs ± 1% 79.1µs ± 0% ~ (p=0.222 n=5+5) GCD10x100000/WithXY-8 240µs ± 1% 228µs ± 1% -4.98% (p=0.008 n=5+5) GCD100x100/WithoutXY-8 1.87µs ± 2% 1.89µs ± 1% ~ (p=0.095 n=5+5) GCD100x100/WithXY-8 26.6µs ± 1% 26.3µs ± 0% -1.14% (p=0.032 n=5+5) GCD100x1000/WithoutXY-8 4.44µs ± 2% 4.47µs ± 2% ~ (p=0.444 n=5+5) GCD100x1000/WithXY-8 36.7µs ± 1% 36.0µs ± 1% -1.96% (p=0.008 n=5+5) GCD100x10000/WithoutXY-8 22.8µs ± 1% 22.3µs ± 1% -2.52% (p=0.008 n=5+5) GCD100x10000/WithXY-8 145µs ± 1% 142µs ± 0% -1.88% (p=0.008 n=5+5) GCD100x100000/WithoutXY-8 198µs ± 1% 190µs ± 0% -4.06% (p=0.008 n=5+5) GCD100x100000/WithXY-8 1.11ms ± 0% 1.09ms ± 0% -1.87% (p=0.008 n=5+5) GCD1000x1000/WithoutXY-8 25.4µs ± 1% 25.0µs ± 1% -1.34% (p=0.008 n=5+5) GCD1000x1000/WithXY-8 515µs ± 1% 485µs ± 0% -5.85% (p=0.008 n=5+5) GCD1000x10000/WithoutXY-8 57.3µs ± 1% 56.2µs ± 1% -1.95% (p=0.008 n=5+5) GCD1000x10000/WithXY-8 1.21ms ± 0% 1.18ms ± 0% -2.65% (p=0.008 n=5+5) GCD1000x100000/WithoutXY-8 358µs ± 0% 352µs ± 1% -1.71% (p=0.008 n=5+5) GCD1000x100000/WithXY-8 8.72ms ± 0% 8.66ms ± 0% -0.71% (p=0.008 n=5+5) GCD10000x10000/WithoutXY-8 690µs ± 0% 687µs ± 1% ~ (p=0.095 n=5+5) GCD10000x10000/WithXY-8 16.0ms ± 0% 12.5ms ± 0% -22.01% (p=0.008 n=5+5) GCD10000x100000/WithoutXY-8 2.09ms ± 0% 2.07ms ± 0% -0.58% (p=0.008 n=5+5) GCD10000x100000/WithXY-8 86.8ms ± 0% 83.4ms ± 0% -3.95% (p=0.008 n=5+5) GCD100000x100000/WithoutXY-8 51.2ms ± 0% 51.2ms ± 0% ~ (p=0.548 n=5+5) GCD100000x100000/WithXY-8 1.25s ± 0% 0.89s ± 0% -28.98% (p=0.008 n=5+5) Hilbert-8 2.46ms ± 2% 2.53ms ± 1% +2.89% (p=0.032 n=5+5) Binomial-8 5.15µs ± 4% 4.92µs ± 1% -4.43% (p=0.032 n=5+5) QuoRem-8 7.10µs ± 0% 7.05µs ± 0% -0.59% (p=0.008 n=5+5) Exp-8 161ms ± 0% 161ms ± 0% -0.24% (p=0.008 n=5+5) Exp2-8 161ms ± 0% 161ms ± 0% -0.30% (p=0.016 n=4+5) Bitset-8 40.4ns ± 0% 40.3ns ± 0% ~ (p=0.159 n=5+5) BitsetNeg-8 158ns ± 4% 155ns ± 2% ~ (p=0.183 n=5+5) BitsetOrig-8 374ns ± 0% 383ns ± 1% +2.35% (p=0.008 n=5+5) BitsetNegOrig-8 620ns ± 1% 663ns ± 2% +7.00% (p=0.008 n=5+5) ModSqrt225_Tonelli-8 7.26ms ± 0% 7.27ms ± 0% ~ (p=0.841 n=5+5) ModSqrt224_3Mod4-8 2.24ms ± 0% 2.24ms ± 0% ~ (p=0.690 n=5+5) ModSqrt5430_Tonelli-8 62.3s ± 0% 62.4s ± 0% +0.15% (p=0.008 n=5+5) ModSqrt5430_3Mod4-8 20.8s ± 0% 20.8s ± 0% ~ (p=0.151 n=5+5) Sqrt-8 101µs ± 0% 97µs ± 0% -3.99% (p=0.008 n=5+5) IntSqr/1-8 32.7ns ± 1% 32.5ns ± 1% ~ (p=0.325 n=5+5) IntSqr/2-8 161ns ± 4% 160ns ± 4% ~ (p=0.659 n=5+5) IntSqr/3-8 296ns ± 7% 297ns ± 6% ~ (p=0.841 n=5+5) IntSqr/5-8 752ns ± 7% 755ns ± 6% ~ (p=0.889 n=5+5) IntSqr/8-8 1.91µs ± 3% 1.90µs ± 3% ~ (p=0.746 n=5+5) IntSqr/10-8 2.99µs ± 4% 3.00µs ± 4% ~ (p=0.516 n=5+5) IntSqr/20-8 6.29µs ± 2% 6.19µs ± 2% ~ (p=0.151 n=5+5) IntSqr/30-8 14.0µs ± 1% 13.8µs ± 2% ~ (p=0.056 n=5+5) IntSqr/50-8 38.1µs ± 3% 37.9µs ± 3% ~ (p=0.548 n=5+5) IntSqr/80-8 95.1µs ± 1% 94.7µs ± 1% ~ (p=0.310 n=5+5) IntSqr/100-8 148µs ± 1% 148µs ± 1% ~ (p=0.548 n=5+5) IntSqr/200-8 587µs ± 1% 587µs ± 1% ~ (p=1.000 n=5+5) IntSqr/300-8 1.31ms ± 1% 1.32ms ± 1% ~ (p=0.151 n=5+5) IntSqr/500-8 2.48ms ± 0% 2.49ms ± 0% ~ (p=0.310 n=5+5) IntSqr/800-8 4.68ms ± 0% 4.67ms ± 0% ~ (p=0.548 n=5+5) IntSqr/1000-8 7.57ms ± 0% 7.56ms ± 0% ~ (p=0.421 n=5+5) Mul-8 311ms ± 0% 311ms ± 0% ~ (p=0.151 n=5+5) Exp3Power/0x10-8 584ns ± 2% 573ns ± 1% ~ (p=0.190 n=5+5) Exp3Power/0x40-8 646ns ± 2% 649ns ± 1% ~ (p=0.690 n=5+5) Exp3Power/0x100-8 1.42µs ± 2% 1.45µs ± 1% +2.03% (p=0.032 n=5+5) Exp3Power/0x400-8 8.28µs ± 1% 8.39µs ± 0% +1.33% (p=0.008 n=5+5) Exp3Power/0x1000-8 60.1µs ± 0% 59.8µs ± 0% -0.44% (p=0.008 n=5+5) Exp3Power/0x4000-8 818µs ± 0% 816µs ± 0% -0.23% (p=0.008 n=5+5) Exp3Power/0x10000-8 7.79ms ± 0% 7.78ms ± 0% ~ (p=0.690 n=5+5) Exp3Power/0x40000-8 73.4ms ± 0% 73.3ms ± 0% ~ (p=0.151 n=5+5) Exp3Power/0x100000-8 665ms ± 0% 664ms ± 0% -0.16% (p=0.016 n=4+5) Exp3Power/0x400000-8 5.99s ± 0% 5.97s ± 0% -0.24% (p=0.008 n=5+5) Fibo-8 116ms ± 0% 117ms ± 0% +0.42% (p=0.008 n=5+5) NatSqr/1-8 113ns ± 2% 112ns ± 1% ~ (p=0.190 n=5+5) NatSqr/2-8 249ns ± 2% 250ns ± 2% ~ (p=0.365 n=5+5) NatSqr/3-8 379ns ± 1% 381ns ± 2% ~ (p=0.127 n=5+5) NatSqr/5-8 838ns ± 3% 841ns ± 5% ~ (p=0.754 n=5+5) NatSqr/8-8 1.97µs ± 3% 1.97µs ± 4% ~ (p=1.000 n=5+5) NatSqr/10-8 3.04µs ± 4% 3.04µs ± 4% ~ (p=1.000 n=5+5) NatSqr/20-8 6.49µs ± 3% 6.50µs ± 2% ~ (p=0.841 n=5+5) NatSqr/30-8 14.3µs ± 2% 14.2µs ± 2% ~ (p=0.548 n=5+5) NatSqr/50-8 38.5µs ± 3% 38.3µs ± 3% ~ (p=0.421 n=5+5) NatSqr/80-8 96.3µs ± 1% 96.1µs ± 1% ~ (p=0.421 n=5+5) NatSqr/100-8 149µs ± 1% 148µs ± 1% ~ (p=0.310 n=5+5) NatSqr/200-8 591µs ± 1% 592µs ± 1% ~ (p=0.690 n=5+5) NatSqr/300-8 1.31ms ± 1% 1.32ms ± 0% ~ (p=0.190 n=5+4) NatSqr/500-8 2.49ms ± 0% 2.49ms ± 0% ~ (p=0.095 n=5+5) NatSqr/800-8 4.70ms ± 0% 4.69ms ± 0% ~ (p=0.222 n=5+5) NatSqr/1000-8 7.60ms ± 0% 7.58ms ± 0% ~ (p=0.222 n=5+5) ScanPi-8 326µs ± 0% 327µs ± 1% ~ (p=0.222 n=5+5) StringPiParallel-8 71.4µs ± 5% 67.7µs ± 4% ~ (p=0.095 n=5+5) Scan/10/Base2-8 1.09µs ± 0% 1.10µs ± 1% ~ (p=0.810 n=5+5) Scan/100/Base2-8 7.79µs ± 0% 7.83µs ± 0% +0.53% (p=0.008 n=5+5) Scan/1000/Base2-8 78.9µs ± 0% 79.0µs ± 0% ~ (p=0.151 n=5+5) Scan/10000/Base2-8 1.22ms ± 0% 1.23ms ± 1% ~ (p=0.690 n=5+5) Scan/100000/Base2-8 55.1ms ± 0% 55.1ms ± 0% +0.10% (p=0.008 n=5+5) Scan/10/Base8-8 512ns ± 1% 534ns ± 1% +4.34% (p=0.008 n=5+5) Scan/100/Base8-8 2.90µs ± 1% 2.92µs ± 0% +0.67% (p=0.024 n=5+5) Scan/1000/Base8-8 31.0µs ± 0% 31.1µs ± 0% +0.27% (p=0.008 n=5+5) Scan/10000/Base8-8 741µs ± 0% 744µs ± 1% ~ (p=0.310 n=5+5) Scan/100000/Base8-8 50.5ms ± 0% 50.7ms ± 0% +0.23% (p=0.016 n=5+4) Scan/10/Base10-8 485ns ± 0% 510ns ± 1% +5.15% (p=0.008 n=5+5) Scan/100/Base10-8 2.68µs ± 0% 2.70µs ± 0% +0.84% (p=0.008 n=5+5) Scan/1000/Base10-8 28.7µs ± 0% 28.8µs ± 0% +0.34% (p=0.008 n=5+5) Scan/10000/Base10-8 717µs ± 0% 720µs ± 1% ~ (p=0.238 n=5+5) Scan/100000/Base10-8 50.3ms ± 0% 50.3ms ± 0% +0.02% (p=0.016 n=4+5) Scan/10/Base16-8 439ns ± 0% 461ns ± 1% +5.06% (p=0.008 n=5+5) Scan/100/Base16-8 2.48µs ± 0% 2.49µs ± 0% +0.59% (p=0.024 n=5+5) Scan/1000/Base16-8 27.2µs ± 0% 27.3µs ± 0% ~ (p=0.063 n=5+5) Scan/10000/Base16-8 722µs ± 0% 725µs ± 1% ~ (p=0.421 n=5+5) Scan/100000/Base16-8 52.7ms ± 0% 52.7ms ± 0% ~ (p=0.686 n=4+4) String/10/Base2-8 248ns ± 1% 248ns ± 1% ~ (p=0.802 n=5+5) String/100/Base2-8 1.51µs ± 0% 1.51µs ± 0% -0.54% (p=0.024 n=5+5) String/1000/Base2-8 13.6µs ± 0% 13.6µs ± 0% ~ (p=0.548 n=5+5) String/10000/Base2-8 135µs ± 1% 135µs ± 2% ~ (p=0.421 n=5+5) String/100000/Base2-8 1.32ms ± 1% 1.33ms ± 1% ~ (p=0.310 n=5+5) String/10/Base8-8 169ns ± 0% 170ns ± 0% ~ (p=0.079 n=5+5) String/100/Base8-8 635ns ± 1% 633ns ± 1% ~ (p=0.595 n=5+5) String/1000/Base8-8 5.33µs ± 0% 5.30µs ± 0% ~ (p=0.063 n=5+5) String/10000/Base8-8 50.7µs ± 1% 50.7µs ± 1% ~ (p=1.000 n=5+5) String/100000/Base8-8 499µs ± 1% 500µs ± 1% ~ (p=1.000 n=5+5) String/10/Base10-8 517ns ± 1% 512ns ± 1% -1.01% (p=0.032 n=5+5) String/100/Base10-8 1.97µs ± 0% 2.01µs ± 1% +2.13% (p=0.008 n=5+5) String/1000/Base10-8 12.6µs ± 1% 12.1µs ± 1% -4.16% (p=0.008 n=5+5) String/10000/Base10-8 57.9µs ± 1% 54.8µs ± 1% -5.46% (p=0.008 n=5+5) String/100000/Base10-8 25.6ms ± 0% 25.6ms ± 0% -0.12% (p=0.008 n=5+5) String/10/Base16-8 149ns ± 0% 149ns ± 1% ~ (p=1.000 n=5+5) String/100/Base16-8 514ns ± 0% 514ns ± 1% ~ (p=0.825 n=5+5) String/1000/Base16-8 4.01µs ± 0% 4.01µs ± 0% ~ (p=0.595 n=5+5) String/10000/Base16-8 37.7µs ± 0% 37.8µs ± 1% ~ (p=0.222 n=5+5) String/100000/Base16-8 373µs ± 1% 372µs ± 0% ~ (p=1.000 n=5+5) LeafSize/0-8 6.64ms ± 0% 6.66ms ± 0% +0.32% (p=0.008 n=5+5) LeafSize/1-8 74.0µs ± 1% 71.2µs ± 1% -3.75% (p=0.008 n=5+5) LeafSize/2-8 74.1µs ± 0% 70.7µs ± 1% -4.53% (p=0.008 n=5+5) LeafSize/3-8 379µs ± 0% 374µs ± 0% -1.25% (p=0.008 n=5+5) LeafSize/4-8 72.7µs ± 0% 69.2µs ± 0% -4.79% (p=0.008 n=5+5) LeafSize/5-8 471µs ± 0% 466µs ± 0% -1.05% (p=0.008 n=5+5) LeafSize/6-8 377µs ± 0% 373µs ± 0% -1.16% (p=0.008 n=5+5) LeafSize/7-8 245µs ± 0% 241µs ± 0% -1.65% (p=0.008 n=5+5) LeafSize/8-8 73.1µs ± 0% 69.4µs ± 0% -5.10% (p=0.008 n=5+5) LeafSize/9-8 538µs ± 0% 532µs ± 0% -1.01% (p=0.008 n=5+5) LeafSize/10-8 472µs ± 0% 467µs ± 0% -1.07% (p=0.008 n=5+5) LeafSize/11-8 460µs ± 0% 454µs ± 0% -1.22% (p=0.008 n=5+5) LeafSize/12-8 378µs ± 0% 373µs ± 0% -1.34% (p=0.008 n=5+5) LeafSize/13-8 344µs ± 0% 338µs ± 0% -1.61% (p=0.008 n=5+5) LeafSize/14-8 247µs ± 0% 243µs ± 0% -1.62% (p=0.008 n=5+5) LeafSize/15-8 169µs ± 0% 165µs ± 0% -2.71% (p=0.008 n=5+5) LeafSize/16-8 73.3µs ± 1% 69.5µs ± 0% -5.11% (p=0.008 n=5+5) LeafSize/32-8 82.7µs ± 0% 79.2µs ± 0% -4.24% (p=0.008 n=5+5) LeafSize/64-8 135µs ± 0% 132µs ± 0% -2.20% (p=0.008 n=5+5) ProbablyPrime/n=0-8 44.2ms ± 0% 43.9ms ± 0% -0.69% (p=0.008 n=5+5) ProbablyPrime/n=1-8 64.8ms ± 0% 64.4ms ± 0% -0.60% (p=0.008 n=5+5) ProbablyPrime/n=5-8 147ms ± 0% 147ms ± 0% -0.34% (p=0.008 n=5+5) ProbablyPrime/n=10-8 250ms ± 0% 249ms ± 0% -0.29% (p=0.008 n=5+5) ProbablyPrime/n=20-8 456ms ± 0% 455ms ± 0% -0.29% (p=0.008 n=5+5) ProbablyPrime/Lucas-8 23.6ms ± 0% 23.2ms ± 0% -1.44% (p=0.008 n=5+5) ProbablyPrime/MillerRabinBase2-8 20.6ms ± 0% 20.6ms ± 0% -0.31% (p=0.008 n=5+5) FloatSqrt/64-8 2.27µs ± 1% 2.11µs ± 1% -7.02% (p=0.008 n=5+5) FloatSqrt/128-8 4.93µs ± 1% 4.40µs ± 1% -10.73% (p=0.008 n=5+5) FloatSqrt/256-8 13.6µs ± 0% 6.6µs ± 1% -51.40% (p=0.008 n=5+5) FloatSqrt/1000-8 69.8µs ± 0% 31.2µs ± 0% -55.27% (p=0.008 n=5+5) FloatSqrt/10000-8 1.91ms ± 0% 0.59ms ± 0% -69.17% (p=0.008 n=5+5) FloatSqrt/100000-8 55.4ms ± 0% 17.8ms ± 0% -67.79% (p=0.008 n=5+5) FloatSqrt/1000000-8 4.56s ± 0% 1.52s ± 0% -66.59% (p=0.008 n=5+5) Change-Id: Icce52c69668f564490c69b908338b21a2288e116 Reviewed-on: https://go-review.googlesource.com/79355 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-03-12 14:42:11 +00:00

1 2 3 4 5 ...

439 Commits